Abstract. The NP-hard Minimum Common String Partition problem asks whether two strings x
and y can each be partitioned into at most k substrings, called blocks, such that both partitions 
use exactly the same blocks in a different order. We present the first fixed-parameter algorithm for 
Minimum Common String Partition using only parameter k. 



1 Introduction 

Computing the evolutionary distance between two genomes is a fundamental problem in comparative
genomics [5]. Herein, the genomes are usually represented as either strings or permutations
and the task is to determine how many operations of a certain kind are needed to transform one 
genome into the other. If the input is a pair of permutations, these problems can be formulated 
as sorting problems, such as Sorting by Transpositions [2] and Sorting by Reversals pp. 
In this work, we study a problem in this context whose input is a pair of strings x and y. Informally,
the operation to transform x into y is to cut x into nonoverlapping substrings and to
reorder these substrings such that the concatenation of the reordered substrings is exactly y.
This transformation is formalized by the notion of common string partition (CSP): a partition V
of two strings x and y into blocks $x_1 x_2 \cdots x_k$ and $y_1 y_2 \cdots y_k$ is a common string partition if there
is a bijection M between $\{x_i \mid 1 \le i \le k\}$ and $\{y_i \mid 1 \le i \le k\}$ such that $x_i$ is the same string
as $M(x_i)$ for all $1 \le i \le k$ (see Figure 1 for an example). Herein, k is called the size of the
common string partition V. We study the problem of finding a minimum-size CSP:

Minimum Common String Partition (MCSP)

Input: Two strings x and y of length n, and an integer k.

Question: Is there a common string partition (CSP) V of size at most k of x and y?

MCSP was introduced independently by Chen et al. [3] and Swenson et al. [10] (who call the 
problem Sequence Cover). MCSP is NP-hard and APX-hard even when each letter occurs 
at most twice [7] . Damaschke [1] initiated the study of MCSP in the context of parameterized 
algorithmics by showing that MCSP is fixed-parameter tractable with respect to the combined 
parameter "partition size k and repetition number r of the input strings" . Subsequently, Jiang 
et al. [8] showed that MCSP can be solved in $(d!)^k \cdot \mathrm{poly}(n)$ time, where d is the maximum
number of occurrences of any letter in either input string. MCSP can be solved in $2^n \cdot \mathrm{poly}(n)$
time [BJ. A greedy heuristic for MCSP was presented by Shapira and Storer [S]. In this work, 
we answer an open question [H |6j |8] by showing that MCSP is fixed-parameter tractable when 
parameterized only by k, that is, we present an algorithm with running time $f(k) \cdot \mathrm{poly}(n)$.

Basic Notation. A marker is an occurrence of a letter at a specific position in a string; we denote
the marker at position i in a string x by x[i]. For all i, $1 \le i < n$, the markers x[i] and x[i + 1]
are called consecutive. An adjacency is a pair of consecutive markers. An interval is a set of
consecutive markers, that is, an interval is a set {x[i], x[i + 1], . . . , x[j]} for some $i \le j$. We write
[a, b] to denote the interval whose first marker is a and whose last marker is b. The length $\|I\|$



Partially supported by a DAAD scholarship. 

Partially supported by a post-doc scholarship of the region "Pays de la Loire" 



ababcd | abadcbbaa | babab | ababa

ababa | babab | abadcbbaa | ababcd

Fig. 1. An instance of MCSP with a common string partition of size four. 
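
The instance of Fig. 1 can be checked mechanically. The following is a minimal Python sketch, assuming that the top row of the figure is x and the bottom row is y; the function name `is_common_string_partition` is ours and does not appear in the paper. It compares the two block sequences as multisets, which is equivalent to the existence of the bijection M.

```python
from collections import Counter

def is_common_string_partition(x, y, blocks_x, blocks_y):
    """Return True if blocks_x and blocks_y form a CSP of the strings x and y."""
    # The blocks must be non-empty and concatenate to the input strings.
    if any(len(b) == 0 for b in blocks_x + blocks_y):
        return False
    if "".join(blocks_x) != x or "".join(blocks_y) != y:
        return False
    # A bijection M with x_i equal to M(x_i) exists iff the multisets agree.
    return Counter(blocks_x) == Counter(blocks_y)

# Blocks read off Fig. 1 (assumption: top row is x, bottom row is y).
blocks_x = ["ababcd", "abadcbbaa", "babab", "ababa"]
blocks_y = ["ababa", "babab", "abadcbbaa", "ababcd"]
x = "".join(blocks_x)
y = "".join(blocks_y)
fig1_is_csp = is_common_string_partition(x, y, blocks_x, blocks_y)
fig1_size = len(blocks_x)
```

Deciding whether a CSP of size at most k exists is the hard part; the sketch only verifies a given candidate, which takes polynomial time.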

of an interval $I$ is the number of markers it contains. Given two markers a and b in the same
string x, we write ab to denote the signed distance between a and b, that is, $ab = \|[a, b]\| - 1$ if a
appears before b in x, and $ab = -\|[b, a]\| + 1$, otherwise. Given two intervals s and t, we write
$s \equiv t$ if they represent the same string of letters (if they have the same contents) and $s = t$ if
they are the same interval, that is, they start and end at the same position in the same string.
Similarly, for two markers a and b we write $a \equiv b$ if their letters are the same, and $a = b$ if
the markers are identical. We say that a string s has period $\pi$ if $s = \rho\pi^{i}\tau$, where $i \ge 1$, $\rho$
is a (possibly empty) suffix of $\pi$, and $\tau$ is a (possibly empty) prefix of $\pi$. We define offset
operators $\triangleright$ and $\triangleleft$: For each marker e and integer d, $e' = e \triangleright d$ is the marker such that $ee' = d$,
and $e \triangleleft d := e \triangleright (-d)$.
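
As an illustration of this notation, the following Python sketch represents a marker by its position in a string. The helper names (`signed_distance`, `offset`, `has_period`) are hypothetical and chosen only for this example.

```python
def signed_distance(i, j):
    """Signed distance between the markers at positions i and j of one string.

    For i before j this is (number of markers in [i, j]) - 1 = j - i; otherwise
    it is -(number of markers in [j, i]) + 1, which is again j - i.
    """
    return j - i

def offset(i, d):
    """Offset operator: the marker whose signed distance from marker i is d."""
    return i + d

def has_period(s, pi):
    """True if s = rho + pi * i + tau with i >= 1, rho a suffix and tau a prefix of pi."""
    p = len(pi)
    if p == 0:
        return False
    for r in range(p):                      # r is the length of the suffix rho
        rho = pi[p - r:] if r else ""
        if not s.startswith(rho):
            continue
        rest = s[r:]
        i, tau_len = divmod(len(rest), p)   # tau is the prefix of pi of length tau_len
        if i >= 1 and rest == pi * i + pi[:tau_len]:
            return True
    return False
```

For instance, both "ababa" and "babab" have period "ab" in this sense, while "abc" does not.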

2 Fundamental Definitions and an Outline of the Algorithm. 

In this section, we first present the most fundamental definitions used by our algorithm and 
then give a brief outline of the main algorithmic strategy followed by the algorithm. 

Some Fundamental Definitions. Let $V = \{x_1 x_2 \cdots x_\ell,\ y_1 y_2 \cdots y_\ell,\ M\}$ be a CSP of strings x
and y. A breakpoint of V is an adjacency in x (or y) that contains the last marker of some
block $x_i$ ($y_i$) and the first marker of the next block $x_{i+1}$ ($y_{i+1}$). We say that V matches two
blocks $x_i$ and $y_j$ if $M(x_i) = y_j$. Furthermore, we say that V matches two markers a and b if a
and b are at the same position in matched blocks. By the definition of a CSP, this implies $a \equiv b$.

The algorithm works on subdivisions of both strings into shorter parts. These subdivisions 
are formalized as follows. 

Definition 1. A splitting of a string (or an interval) z is a list of intervals $[a_1, b_1], [a_2, b_2], \ldots,$
$[a_m, b_m]$, each of length at least two, called pieces, such that $a_1 = z[1]$, $a_{j+1} = b_j$ for all $j < m$,
and $b_m = z[\|z\|]$.

Informally, a splitting is a partition of the adjacencies of a string (or an interval) such that each 
part contains only consecutive adjacencies. 
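
A small Python sketch may help to see why consecutive pieces share a marker but not an adjacency. It assumes pieces are given as pairs of 1-based positions; the names `is_splitting` and `adjacencies` are ours.

```python
def is_splitting(n, pieces):
    """Check Definition 1 for a string of length n; pieces is a list of (a, b) pairs."""
    if not pieces:
        return False
    # The first piece starts at z[1] and the last piece ends at z[n].
    if pieces[0][0] != 1 or pieces[-1][1] != n:
        return False
    # Every piece has length at least two.
    if any(b - a + 1 < 2 for a, b in pieces):
        return False
    # Each piece starts at the marker where the previous one ends.
    return all(pieces[j + 1][0] == pieces[j][1] for j in range(len(pieces) - 1))

def adjacencies(piece):
    """The adjacencies (pairs of consecutive positions) contained in a piece."""
    a, b = piece
    return [(i, i + 1) for i in range(a, b)]

example = [(1, 3), (3, 4), (4, 6)]
example_adjacencies = [adj for piece in example for adj in adjacencies(piece)]
```

In the example, the three pieces overlap in the markers 3 and 4, yet every adjacency of the length-6 string belongs to exactly one piece.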

The strategy of the algorithm is to infer more and more information about a small CSP. To 
put it another way, it makes more and more restrictions on the CSP that it tries to construct. To 
this end, the algorithm will annotate splittings as follows: a piece is called fragile if it contains 
at least one breakpoint, and solid if it contains no breakpoint. To simplify the representation, 
the algorithm sometimes merges consecutive pieces $[a_i, b_i]$ and $[a_{i+1}, b_{i+1}]$ (where $b_i = a_{i+1}$) into
one, that is, it removes $[a_i, b_i]$ and $[a_{i+1}, b_{i+1}]$ from some splitting and adds the interval $[a_i, b_{i+1}]$
to this splitting.

To further restrict the CSP, the algorithm finds pairs of solid pieces in x and y that are 
contained in blocks that are matched by the CSP. Accordingly, a pair of solid pieces s in x
and t in y is called matched in a CSP V if s is contained in a block of V that is matched to a 
block that contains t. Note that matched solid pieces may correspond to different parts of their 
blocks. For example, one piece may contain the first marker but not the last marker of its block 
in x and it can be matched to a solid piece that contains the last but not the first marker of its 
block in y. Hence, when looking at the two blocks containing the pieces, there can be a "shift" 
between the matched pieces. We formalize this as follows, see Fig. 2 (left) for an example.

Definition 2. Let [a, b] be a piece of a splitting of x and [c, d] be a piece of a splitting of y. The
alignment of [a, b] and [c, d] of shift $\delta$ is the pair of reference markers a and $c \triangleright \delta$, where

- $-ab \le \delta \le cd$,

- $[a, b] \equiv [c \triangleright \delta,\ c \triangleright (ab + \delta)]$ and $[c, d] \equiv [a \triangleright (-\delta),\ a \triangleright (cd - \delta)]$.

Hence, an alignment fixes how the interval [a, b] is shifted with respect to [c, d] in the matched
blocks that contain the intervals. That is, if [a, b] starts at position j in its block, then [c, d]
starts at position $j - \delta$. For matched solid pieces, an alignment thus fixes which markers are
matched to each other by the CSP. In particular, the marker a is matched to $c \triangleright \delta$ and c is
matched to $a \triangleleft \delta$. Note that the maximum and minimum values allowed for $\delta$ ensure that there
is at least one marker in [a, b] that is matched to a marker in [c, d] by a CSP corresponding to 
this alignment. The algorithm will only consider such alignments between matched solid pieces. 
The second condition verifies that all pairs of matched markers indeed correspond to the same 
letter. Clearly, this restriction is fulfilled by every CSP that does not put breakpoints in the 
solid pieces [a, b] and [c, d]. A pair of matched solid pieces is called fixed if it is associated with an 
alignment (equivalently, with a pair of reference markers) and repetitive otherwise (the reason 
for choosing this term will be given below). For a fixed solid piece s, we use s* as shorthand for 
the uniquely determined reference marker of the alignment of s which is in the same string as s. 
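
The two conditions of Definition 2 can be enumerated directly. The sketch below is an illustration under our own conventions (0-based inclusive positions, a hypothetical function name `alignments`); it is not taken from the paper. It returns all shifts for which both conditions hold.

```python
def alignments(x, a, b, y, c, d):
    """Shifts delta admitted by Definition 2 for the pieces x[a..b] and y[c..d].

    Positions are 0-based and inclusive. The marker x[a] is aligned with y[c + delta].
    """
    ab, cd = b - a, d - c                      # signed distances inside the pieces
    shifts = []
    for delta in range(-ab, cd + 1):           # first condition: -ab <= delta <= cd
        lo_y, hi_y = c + delta, c + delta + ab
        lo_x, hi_x = a - delta, a - delta + cd
        # The shifted intervals must exist in the respective strings.
        if lo_y < 0 or hi_y >= len(y) or lo_x < 0 or hi_x >= len(x):
            continue
        # Second condition: the shifted intervals have the same contents.
        if x[a:b + 1] == y[lo_y:hi_y + 1] and y[c:d + 1] == x[lo_x:hi_x + 1]:
            shifts.append(delta)
    return shifts

unique_shift = alignments("zabcdz", 1, 3, "qabcdq", 2, 4)
periodic_shifts = alignments("abababab", 2, 4, "abababab", 2, 4)
```

The second call shows the effect of periodicity: in a string with a short period, a pair of pieces admits one alignment per multiple of the period length that fits into the allowed range.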

These restrictions on a possible CSP are summarized in the notion of constraints, defined 
as follows, see Fig. 2 (right) for an example.

Definition 3. A constraint C is a tuple $(S, F, M, R_S)$ such that:

- S is a set of solid pieces. Let $S_x$ ($S_y$) denote the pieces of S from x (y).

- F is a set of fragile pieces. Let $F_x$ ($F_y$) denote the pieces of F from x (y).

- The pieces of $S_x \cup F_x$ ($S_y \cup F_y$) form a splitting of x (y) in which solid and fragile pieces
alternate.

- $M : S_x \to S_y$ is a matching, that is, a bijection between $S_x$ and $S_y$. As shorthand, we
write $s' = M(s)$ if $s \in S_x$ and $s' = M^{-1}(s)$ if $s \in S_y$.

- $R_S$ is a set of alignments that contains for each matched pair of solid pieces at most one
alignment.

Our algorithm will search for CSPs that satisfy such constraints. 

Definition 4. A CSP V satisfies the constraint $C = (S, F, M, R_S)$ if:

1. All breakpoints of V are contained in fragile pieces.

2. Each fragile piece contains at least one breakpoint from V.

3. Matched solid pieces are contained in matched blocks in V.

4. If s is a fixed solid piece, then markers s* and s'* are matched in V.

5. If s is a repetitive solid piece, then s, s' and the blocks containing them in V all have the
same shortest period.

Equivalent formulations of Conditions 1 and 2 are that (1') all solid pieces are contained in
blocks of V, and (2') different solid pieces in the same string are in different blocks. Given a
CSP V that satisfies a constraint C, we call a block short, or undiscovered by C, if it does not 
contain a solid piece (equivalently, if it is contained in a fragile piece). The other blocks are 
called long or discovered by C. 
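
Conditions 1 and 2 of Definition 4 concern a single string at a time and are easy to test. The following sketch assumes that pieces are pairs (a, b) of positions and that a breakpoint is identified by the position i of its left marker, that is, it stands for the adjacency (i, i + 1). The function name is hypothetical.

```python
def satisfies_breakpoint_conditions(fragile, breakpoints):
    """Check Conditions 1 and 2 of Definition 4 for the pieces of one string."""
    def inside(i, piece):
        a, b = piece
        return a <= i < b            # adjacency (i, i + 1) lies within [a, b]

    # Condition 1: every breakpoint lies in some fragile piece.
    cond1 = all(any(inside(i, f) for f in fragile) for i in breakpoints)
    # Condition 2: every fragile piece contains at least one breakpoint.
    cond2 = all(any(inside(i, f) for i in breakpoints) for f in fragile)
    return cond1 and cond2

fragile_pieces = [(1, 4), (7, 9)]
```

Conditions 3 to 5 additionally need the matching and the alignments and are not covered by this sketch.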

Finally, we introduce the following notion that concerns reference markers and fixed solid 
pieces. 

Definition 5. Let s and s' be fixed matched solid pieces in x and y. Two markers a in x and b 
in y are equidistant from s if s*a = s'*b. Similarly, two intervals [a,b] in x and [c,d] in y are 
equidistant from s if a and c are equidistant from s and b and d are equidistant from s. 
Fig. 2. Left: Example of alignment between two pieces s and s'. Reference markers are marked with a star, the
shift is 2. Intervals having the same content as the pieces according to this alignment are marked in gray. Note 
that there also exists an alignment of shift —3, where the reference marker in y is the first occurrence of a. 
Right: A constraint with three pairs of solid pieces illustrated by boxes. Two of these pairs are fixed and one is 
repetitive (rep). Matched solid pieces are linked with edges. The fragile pieces (red and dashed lines) contain the 
breakpoints (red crosses) of a size-5 CSP satisfying the constraint. 

We will use it to talk about the "local environment" of the reference markers in both strings. In 
particular, with this notation we can identify (sets of) markers that are matched to each other 
if they are both in the same block as the reference markers. 

An Outline of the Algorithm and its Main Method. We now give a high-level description of the
main idea of the algorithm; the pseudo-code of the main algorithm loop is shown in Algorithm 1.
For the discussion, assume that the instance is a yes-instance, that is, there exists a CSP V
of size k. Since we can check in polynomial time the size and correctness of any CSP before
outputting it, we can safely assume that the algorithm gives no output for no-instances; hence
the focus on yes-instances. The algorithm gradually extends a constraint that is satisfied by a
solution V and outputs V eventually. Initially, the constraint consists solely of two fragile pieces,
one containing all of x and one all of y. We assume that the input strings are not identical.
Hence, every CSP has at least one breakpoint and the initial constraint is thus satisfied by every
size-k CSP.

The algorithm now aims at discovering the blocks of V successively, from the longest to the
shortest. Recall that a block is called discovered by a constraint C if there is a solid piece in C
that is contained in this block. To execute the strategy of finding shorter and shorter blocks, the
algorithm needs some knowledge about the approximate (by a factor of 2) length of the longest
undiscovered block in V. To this end, the algorithm keeps and updates an integer variable $\beta$
which has the following central property: Whenever there is a size-k CSP satisfying the current
constraint, then there is in particular one size-k CSP V such that

1. the longest short block of V has length $\ell$ with $\beta \le \ell < 2\beta$, and

2. $\beta$ is minimum among all integers satisfying Property 1.

Accordingly, we call a block $\beta$-critical if it has length $\ell$ with $\beta \le \ell < 2\beta$. To obtain $\beta$, we
consider all subsets $\Pi'$ of the set $\Pi$ containing all powers of 2 that are smaller than n. One of
these sets will contain the "correct" approximate block lengths. The central strategy is: Set $\beta$
to be the largest value in $\Pi'$. Discover all $\beta$-critical blocks. Then, there is a satisfying CSP such
that all undiscovered blocks are shorter than the current $\beta$. Thus update $\beta$ by taking the next
largest value from $\Pi'$. Then, again discover all $\beta$-critical blocks, update $\beta$ again and so on.

First, note that there is at least one block of length at least $\lceil n/k \rceil$ since V has size k, so
$\max \Pi' \ge \lceil n/2k \rceil$. Furthermore, for any CSP of size k, $|\Pi'| \le k$. Hence, the outer algorithm
loop of Algorithm 1 is traversed once for the correct $\Pi'$. Note furthermore, that the number of
subsets of $\Pi$ is $O(2^{\log n}) = O(n)$. Hence, there are O(n) traversals of the outer loop of the main
method.
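
The candidate sets traversed by the outer loop can be enumerated as follows. This is a sketch with hypothetical helper names; it simply lists the subsets of the powers of 2 below n that pass the two filters stated above.

```python
from itertools import combinations

def powers_of_two_below(n):
    """All powers of 2 (starting with 2^0 = 1) that are smaller than n."""
    out, p = [], 1
    while p < n:
        out.append(p)
        p *= 2
    return out

def candidate_sets(n, k):
    """Yield the subsets Pi' with max(Pi') >= ceil(n / 2k) and |Pi'| <= k."""
    pi = powers_of_two_below(n)
    bound = -(-n // (2 * k))                 # integer ceiling of n / (2k)
    for size in range(1, min(k, len(pi)) + 1):
        for subset in combinations(pi, size):
            if max(subset) >= bound:
                yield set(subset)

sets_for_16_2 = list(candidate_sets(16, 2))
```

For n = 16 and k = 2 the set of powers of 2 is {1, 2, 4, 8}, the bound is 4, and seven candidate sets remain: {4}, {8}, and every pair except {1, 2}.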

Consider now the traversal for the correct set $\Pi'$. The inner loop of the algorithm consists
of two main steps. In the first step, called split, the algorithm discovers the $\beta$-critical blocks.
More precisely, it refines C by breaking fragile pieces into shorter pieces (of length $\lceil\beta/3\rceil$) and

3 Parts of this algorithm, in particular the split procedure follow somewhat the approach of Damaschke [J. 
Algorithm 1 The main algorithm loop MCSP(x, y, k).

1  $\Pi := \{i \in \mathbb{N} \mid i < n \wedge \exists j \in \mathbb{N} : 2^j = i\}$
2  $C := (S := \emptyset,\ F := \{[x[1], x[n]], [y[1], y[n]]\},\ M := \emptyset,\ R_S := \emptyset)$  // initially only two fragile pieces
3  for each $\Pi' \subseteq \Pi$ with $\max \Pi' \ge \lceil n/2k \rceil \wedge |\Pi'| \le k$ :
4      $\beta \leftarrow \max \Pi'$; $\Pi' \leftarrow \Pi' \cup \{0\} \setminus \{\beta\}$  // 2-approx. length of longest block
5      repeat until $\beta < 4$ :
6          split  // discover blocks of length at least $\beta$
7          $\beta \leftarrow \max \Pi'$; $\Pi' \leftarrow \Pi' \setminus \{\beta\}$  // update 2-approx. length of longest undiscovered blocks
8          frames  // reduce length of fragile pieces
9      branch into all cases to set breakpoints within fragile pieces
10     if the resulting string partition V is a size-k CSP : output V




identifying those that are contained in $\beta$-critical blocks. It then produces a matching and, if
this is possible without considering too many options, aligns these blocks.

To be efficient, split requires that the input fragile pieces are short enough compared to $\beta$
and k. Initially, this is not a problem, since the fragile pieces have length n, and $\beta \ge n/2k$.
After split, however, we update $\beta$. Hence, between two calls to split the fragile pieces have
to be reduced in order to fit the undiscovered blocks more "tightly". This is the objective of
frames, which uses a set of rules to identify smaller intervals containing all breakpoints of V. It
thus shrinks the fragile pieces of C so that they are sufficiently small for the next call to split.

The algorithm now continues with this process for smaller and smaller values of $\beta$. It stops
in case $\beta < 4$, since it can then locate all breakpoints by applying a brute-force branching. Note
that in order to ensure that there is always a $\beta < 4$, we add the value 0 to the set $\Pi'$ in Line 4 of
the main method.

In the remainder of this work, we give the details for the procedures split and frames.
In Section 3, we describe the split procedure, and show its correctness. We also show, using
several properties of frames as a black box, our main result. Then, in Sections 4 and 5, we fill
in the blanks by proving the properties of frames.

The algorithm is a branching algorithm that extends the constraint C in each branch. In 
order to simplify the pseudo-code somewhat, we describe the algorithm in such a way that the 
variables C and /3 are global variables. After a branching statement in the pseudo-code, the 
algorithm continues in each branch with the following line of the pseudo-code. If a branch is 
known to be unsuccessful, then the algorithm returns immediately to the branching statement 
that created this branch (or to the branching statement above, if the current branch is the last 
branch of that statement). We denote this by the "abort branch" command; all modifications 
within this branch are undone. 



3 Splitting of Fragile Pieces 

In this section, we describe the procedure split and show its correctness. The pseudo-code
of split is shown in Algorithm 2. At the beginning of split the constraint contains a set of
discovered blocks. Assume that all blocks of length at least $2\beta$ are discovered by this constraint.
The aim of split now is to perform a branching into several cases such that in at least one of
the created branches the constraint C now additionally contains all $\beta$-critical blocks. Hence, in
this branch all blocks of length at least $\beta$ are discovered. Procedure split starts by replacing
each former fragile piece by a splitting where all new pieces have length $\lceil\beta/3\rceil$ except for the
rightmost new piece of each such splitting which can be shorter. We call such a splitting a
$\lceil\beta/3\rceil$-splitting. It then considers all branches where each piece is either fragile or solid. In order
to maintain the alternating condition, consecutive solid (resp. fragile) pieces are merged into
one solid (fragile) piece, Lines 7-9.
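
The two operations just described can be sketched in Python as follows. The sketch works on one string, represents pieces as position pairs, and labels them 'S' (solid) or 'F' (fragile); these conventions and the function names are assumptions made for illustration. It covers one fixed labelling, not the branching over all labellings.

```python
def beta_third_splitting(a, b, beta):
    """Split the piece (a, b) into pieces of length ceil(beta / 3).

    Only the rightmost piece may be shorter. Requires beta >= 4 so that every
    piece has length at least two.
    """
    if beta < 4:
        raise ValueError("beta must be at least 4")
    length = -(-beta // 3)           # integer ceiling of beta / 3
    step = length - 1                # consecutive pieces share one marker
    pieces, start = [], a
    while start < b:
        end = min(start + step, b)
        pieces.append((start, end))
        start = end
    return pieces

def merge_same_kind(labelled):
    """Merge consecutive pieces with the same label so that labels alternate."""
    merged = []
    for (a, b), kind in labelled:
        if merged and merged[-1][1] == kind:
            (a0, _), _ = merged[-1]
            merged[-1] = ((a0, b), kind)
        else:
            merged.append(((a, b), kind))
    return merged

new_pieces = beta_third_splitting(1, 10, 9)
labels = ["F", "S", "S", "F", "F"]   # one possible branch
after_merge = merge_same_kind(list(zip(new_pieces, labels)))
```

With $\beta = 9$, a fragile piece covering positions 1 to 10 is cut into four pieces of length 3 and a shorter last piece; after merging, the chosen branch has one solid piece between two fragile pieces.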

Next, split extends the matching and the set of alignments of the constraint. All possible 
matchings are considered in separate branches (Lines 12-14). Then, split performs an exhaustive

Algorithm 2 Procedure split. Global variables: $C = (S, F, M, R_S)$ and $\beta$.

1  $N := \emptyset$  // the set of new pieces
2  for each fragile piece $f \in F$ :
3      $F \leftarrow F \setminus \{f\}$  // old fragile pieces are removed
4      $N \leftarrow N \cup$ "$\lceil\beta/3\rceil$-splitting of f"  // update set of new pieces
5  for each $p \in N$ :  // make p either fragile or solid
6      branch into the case that either $S \leftarrow S \cup \{p\}$ or $F \leftarrow F \cup \{p\}$
7  while $\exists$ consecutive pieces $s_1, s_2$ s.t. $\{s_1, s_2\} \subseteq S$ (or $\{s_1, s_2\} \subseteq F$) :
8      p := "merged interval of $s_1$ and $s_2$"
9      $S \leftarrow (S \cup \{p\}) \setminus \{s_1, s_2\}$ (or $F \leftarrow (F \cup \{p\}) \setminus \{s_1, s_2\}$)
10 if $|S_x| \ne |S_y|$ : abort branch  // no bijection of solid pieces exists
11 if $|F_x| > k$ or $|F_y| > k$ : abort branch  // too many fragile pieces in x (or y)
12 while $\exists$ unmatched solid piece $s \in S_x$ :
13     for each unmatched solid piece t in $S_y$ :
14         branch into the case that M(s) := t
15 for each new pair (s, t) of matched solid pieces :
16     i := "number of alignments with shift $\delta$ s.t. $|\delta| < \lceil\beta/3\rceil$"
17     if $i \le 6$ : for each alignment branch into the case to add this alignment to $R_S$
18     else: branch into the cases to:  // s and s' are periodic
           - align s and s' such that lbreak(s) and lbreak(s') are equidistant from s
           - align s and s' such that rbreak(s) and rbreak(s') are equidistant from s
           - do not align s and s'

branching over all alignments for a given pair of solid pieces, but only if there are very
few of them (Line 17). If there are too many (Line 18), then it can be seen that the pieces are
periodic with a short period length. Thus, the blocks containing them might be periodic as well.
If the blocks are not periodic, then there are at most two alignments that the algorithm needs
to consider: informally, the period in the blocks can be "broken" either to the left or to the
right of the pieces. To specify these two possibilities more clearly, we introduce the following
notation. Let s = [a, b] be an interval in a string x such that s has period $\pi$. Then, we denote
by lbreak(s) the rightmost marker in x such that [lbreak(s), b] does not have period $\pi$. Similarly,
let rbreak(s) be the leftmost marker in x such that [a, rbreak(s)] does not have period $\pi$. If the
blocks are periodic, there may be too many possible alignments, and the alignment between the
pieces will be fixed at a later point (when $\beta$ becomes smaller than the period). However, the
algorithm will use the "knowledge" that the blocks are periodic in the frames procedure.
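
The markers lbreak(s) and rbreak(s) can be found by extending the piece until the periodicity fails. The sketch below simplifies the setting by describing the period through its length p rather than through the string $\pi$, and it assumes that the piece x[a..b] (0-based, inclusive) has length at least p. The function names follow the notation above, while the signatures are our own.

```python
def has_period_length(s, p):
    """True if s[j] == s[j + p] for all valid j."""
    return all(s[j] == s[j + p] for j in range(len(s) - p))

def lbreak(x, a, b, p):
    """Rightmost position l such that x[l..b] is not p-periodic, or None."""
    assert b - a + 1 >= p and has_period_length(x[a:b + 1], p)
    l = a - 1
    # Extending to the left keeps the period as long as x[l] == x[l + p].
    while l >= 0 and x[l] == x[l + p]:
        l -= 1
    return l if l >= 0 else None

def rbreak(x, a, b, p):
    """Leftmost position r such that x[a..r] is not p-periodic, or None."""
    assert b - a + 1 >= p and has_period_length(x[a:b + 1], p)
    r = b + 1
    # Extending to the right keeps the period as long as x[r] == x[r - p].
    while r < len(x) and x[r] == x[r - p]:
        r += 1
    return r if r < len(x) else None
```

For x = "cabababd" and the piece at positions 2 to 5 ("baba", period length 2), the period is broken at position 0 on the left and at position 7 on the right. If the whole string is periodic, no such marker exists and the sketch returns None.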

We now show that split is correct if the input constraint can be satisfied and that it
discovers all $\beta$-critical blocks.

Lemma 1. Let C be the constraint at the beginning of split, and let V be a size-k CSP satisfying
C such that all blocks of length at least $2\beta$ of V are discovered by C. Then, split creates
at least one branch whose constraint C

- is satisfied by V, and

- all blocks of length at least $\beta$ are discovered by C.

Proof (of Lemma 1). Let $B = \{(x^1, y^1), \ldots, (x^\ell, y^\ell)\}$ be the uniquely defined set of matched
pairs of undiscovered blocks in V that are $\beta$-critical.

Consider the following branching for Lines 5-6 for each piece $p \in N$: If p is contained in
some block $x^i$ or $y^i$ of B, then branch into the case that p is added to S. Otherwise, branch
into the case that p is added to F (note that we may add in F some pieces that do not contain
any breakpoint, but are contained in blocks not in B).

Now consider the constraint obtained for the above branching after the merging operations
performed in Lines 7-9. We show that V satisfies Conditions 1 and 2 of this constraint. First,
consider a breakpoint in V. This breakpoint is contained in some fragile piece f of the input
constraint since V satisfies this input constraint. Hence, it is contained in some new piece p of
the splitting of this fragile piece. Clearly, the piece p is added to F in the considered branching.
Moreover, in case Lines 7-9 merge fragile pieces, the resulting piece is also fragile, hence p
remains in a fragile piece. Consequently, all breakpoints of V are in fragile pieces of F, and thus
Condition 1 is satisfied by V.

Now consider a fragile piece $f \in F$ after Lines 7-9 of the algorithm. We show that f
contains at least one breakpoint. Note that f is obtained after a (possibly empty) series of
merging operations. After the merging, f is between two solid pieces. If f is also a fragile piece
in the input constraint, then f contains a breakpoint since V satisfies the input constraint.
Otherwise, f is contained in a fragile piece of the input constraint, and at least one of its
neighbor pieces is a new solid piece s. Since f (or all the smaller pieces that were merged to f)
are added to F by the branching, they are not contained in the block that contains s. Hence, f
contains the breakpoint between the first (or last) marker of the block containing the new solid
piece and its predecessor (or successor). Thus, Condition 2 is also satisfied by V.

Note that the above also implies that, for each $x^i$ of B, there is exactly one new solid piece
that is contained in $x^i$. Similarly, for each $y^i$ of B, there is exactly one new solid piece that is
contained in $y^i$. Note that in this branching, $|S_x| = |S_y|$ and furthermore, since V has size k,
$|F_x| < k$ and $|F_y| < k$. Hence, the algorithm does not abort in Lines 10 and 11. We now consider
the branching in which for each pair $(x^i, y^i)$, the two corresponding solid pieces are matched
to each other. Clearly, this branching fulfills Condition 3: the condition holds obviously for all
pieces contained in blocks of B. Furthermore, it holds for all old solid pieces since for these, the
matching M has not changed. Note that the function M also remains a bijection: it is changed
only for unmatched solid pieces, and the number of new solid pieces in x and y is equal.

It remains to show that there is a branching in which Conditions 4 and 5 also hold. Consider a
pair of matched solid pieces s and s', and the blocks $x^i$, $y^i$ containing them. We use the following
technical claim in order to clarify the discussion; it will be proven afterwards.

Fact. If there are more than six alignments of s and s' whose shifts have an absolute value
of at most $\lceil\beta/3\rceil$, then

i. s and s' are periodic with the same shortest period $\pi$ (with $\|\pi\| \le \lceil\beta/3\rceil/2$);

ii. if the blocks $x^i$ and $y^i$ do not have period $\pi$, then in V either lbreak(s) is matched
to lbreak(s') or rbreak(s) is matched to rbreak(s') (or both).

Let a be the leftmost marker of s and a' be the marker matched to a in V. Then {a, a'} is an
alignment for (s, s') whose shift has an absolute value less than $\lceil\beta/3\rceil$: there are at most $\lceil\beta/3\rceil - 1$
markers preceding either s or s' that can belong to the same block since the pieces of the
$\lceil\beta/3\rceil$-splitting preceding s and s' are fragile and thus not contained in the same blocks. If
the condition of Line 17 is satisfied, then there is one branch where alignment {a, a'} is added
to $R_S$. Otherwise, by the fact above, the following cases are possible. Either {a, a'} is one of the
alignments where lbreak(s) is matched to lbreak(s') or rbreak(s) is matched to rbreak(s'), in which
cases {a, a'} is added to $R_S$ in one of the branches. Otherwise, (s, s') is not fixed, and s and s'
are contained in blocks having the same shortest periods.

Altogether this shows the first claim of the lemma. The second claim can be seen as follows.
The blocks of length $\ell \ge 2\beta$ are already discovered, and the corresponding solid pieces remain in
the constraint. It thus remains to consider the $\beta$-critical blocks. We show that for each $x^i$ there
is at least one piece that is contained in $x^i$. Consider the marker a at position $\lceil\beta/3\rceil$ in $x^i$ and a
piece s of the $\lceil\beta/3\rceil$-splitting that contains this marker. Then s contains only markers from $x^i$
since s has length at most $\lceil\beta/3\rceil$ and $x^i$ has length at least $\beta \ge 2\lceil\beta/3\rceil$ (for $\beta \ge 4$). Afterwards, s
is only merged with other pieces that are contained in $x^i$ (recall that in the considered branching
there is a fragile piece between all solid pieces from different blocks). Hence, the second claim
of the lemma also holds.

It remains to show the correctness of the claimed fact. We first need to prove the following
claim. Define the $\lceil\beta/3\rceil$-middle of an interval [u, v] as the length-$\lceil\beta/3\rceil$ interval centered in [u, v]
(formally, the interval $[\bar{u}, \bar{v}]$ with $\bar{u} = u \triangleright \lfloor (uv - \lceil\beta/3\rceil)/2 \rfloor$ and $\bar{v} = v \triangleleft \lceil (uv - \lceil\beta/3\rceil)/2 \rceil$). Then
s contains the $\lceil\beta/3\rceil$-middle of $x^i$ and s' contains the $\lceil\beta/3\rceil$-middle of $y^i$.

The claim is shown for s = [a, b] (the proof for s' is similar). Write $x^i = [u, v]$, and $[\bar{u}, \bar{v}]$ the
$\lceil\beta/3\rceil$-middle of $x^i$. First note that since $x^i$ has length at least $\beta$, we have $uv \ge \beta - 1$. We show
that a is in the interval $[u, \bar{u}]$:

$$\begin{aligned}
u\bar{u} &= \lfloor (uv - \lceil\beta/3\rceil)/2 \rfloor \\
&\ge \lfloor (\beta - 1 - \lceil\beta/3\rceil)/2 \rfloor \\
&\ge \lfloor (\lfloor 2\beta/3 \rfloor - 1)/2 \rfloor \\
&\ge \lfloor (2\beta/3 - 1.7)/2 \rfloor \\
&\ge \lfloor \beta/3 - 0.85 \rfloor \\
&\ge \lceil\beta/3\rceil - 2.
\end{aligned}$$

Since the piece with right endpoint a in the $\lceil\beta/3\rceil$-splitting is fragile (it has not been merged
with s), it contains a breakpoint of V and hence a marker strictly to the left of u. Moreover
it has length at most $\lceil\beta/3\rceil$, so $ua \le \lceil\beta/3\rceil - 2$, which implies that a is in the interval $[u, \bar{u}]$.
Similarly, b is in the interval $[\bar{v}, v]$, and [a, b] contains the $\lceil\beta/3\rceil$-middle of $x^i$.
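
The chain of inequalities above only involves integer rounding and can be sanity-checked numerically. The following sketch is an illustration, not a proof. It checks the identity $\beta - \lceil\beta/3\rceil = \lfloor 2\beta/3 \rfloor$ and the final bound for a range of values of $\beta$; the range limit is arbitrary.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive b."""
    return -(-a // b)

def bound_holds(beta):
    """Check floor((beta - 1 - ceil(beta/3)) / 2) >= ceil(beta/3) - 2."""
    c = ceil_div(beta, 3)
    return (beta - 1 - c) // 2 >= c - 2

def identity_holds(beta):
    """Check beta - ceil(beta/3) == floor(2 * beta / 3)."""
    return beta - ceil_div(beta, 3) == (2 * beta) // 3

all_bounds_ok = all(bound_holds(beta) for beta in range(4, 10000))
all_identities_ok = all(identity_holds(beta) for beta in range(4, 10000))
```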
We can now turn to proving the two statements of the fact. 

(i) Let s = [a, b], s' = [a', b'] and $\delta_1, \delta_2, \ldots, \delta_m$ be the shifts of the $m \ge 7$ alignments such
that $-\lceil\beta/3\rceil < \delta_1 < \delta_2 < \cdots < \delta_m < \lceil\beta/3\rceil$. Write i the index such that $\delta_{i+1} - \delta_i$ is minimal,
and $p = \delta_{i+1} - \delta_i$. We thus have

$$p \le \frac{\delta_m - \delta_1}{m - 1} \le \lceil\beta/3\rceil/2.$$

Recall also that both s and s' have length at least $2\lceil\beta/3\rceil - 1$. Let q be an integer with $p \le q \le ab$.
Using the second condition in the definition of alignment, we have

$$\begin{aligned}
a \triangleright q &\equiv a' \triangleright (\delta_i + q) && \text{(since } a \triangleright q \in [a, b]\text{)} \\
&= a' \triangleright (\delta_{i+1} + q - p) \\
&\equiv a \triangleright (q - p) && \text{(since } a \triangleright (q - p) \in [a, b]\text{)}
\end{aligned}$$

Thus intervals [a, b] and (symmetrically) [a', b'] are both periodic with period length p: the
shortest periods of s and s' have length at most $\lceil\beta/3\rceil/2$.

Using the fact that s and s' both contain the $\lceil\beta/3\rceil$-middle of the matched blocks in which
they are contained, they have a common substring of length greater than twice their shortest
periods. They thus have the same shortest period.

(ii) Recall that $x^i$ ($y^i$) is the block containing s (s') in V. Write $[\bar{u}, \bar{v}]$ ($[\bar{u}', \bar{v}']$) the
$\lceil\beta/3\rceil$-middle of $x^i$ ($y^i$). Since $[\bar{u}, \bar{v}] \subseteq s$ and $\|[\bar{u}, \bar{v}]\| \ge \|\pi\|$, we have that lbreak(s) is the
rightmost marker in x and rbreak(s) is the leftmost marker in x such that intervals $[\mathrm{lbreak}(s), \bar{v}]$ and
$[\bar{u}, \mathrm{rbreak}(s)]$ do not have period $\pi$. We have the similar property for lbreak(s') (rbreak(s')) and $\bar{v}'$
($\bar{u}'$).

Since $x^i$ contains $[\bar{u}, \bar{v}]$, either $x^i$ has period $\pi$, or it contains lbreak(s) or rbreak(s).
Suppose that $x^i$ contains lbreak(s) (the case where $x^i$ contains rbreak(s) is similar). Write $l'$ the
marker in $y^i$ matched to lbreak(s) by V. Then $[l', \bar{v}'] \equiv [\mathrm{lbreak}(s), \bar{v}]$ does not have period $\pi$, and
for all $m' \in [l' \triangleright 1, \bar{v}']$, $[m', \bar{v}']$ has period $\pi$. Thus, $l'$ is the rightmost marker such that $[l', \bar{v}']$
does not have period $\pi$, and $l' = \mathrm{lbreak}(s')$: markers lbreak(s) and lbreak(s') are matched in V. □

The following trivial observation follows from the check in Line 11 of split. It is useful for 
bounding the running time of split (in particular for later calls to split). 
Observation 1. After split has finished, the constraint contains at most $2k - 2$ fragile pieces
from each of x and y. The overall number of solid pieces is thus at most 2k. 

To obtain a fixed-parameter algorithm for parameter k, we now "shrink" the fragile pieces
between the solid pieces of the constraint. This will ensure that in the next call to split, the
number of new pieces created in the splitting will be bounded by a function of k. Note that by
Lemma 1, split has discovered all blocks that have length at least $\beta$. Hence, we now update
the value $\beta$ denoting the approximate length of the longest short blocks (by taking the largest
remaining value from $\Pi'$). Then, frames uses this updated value of $\beta$ to shrink the fragile pieces.
For the moment, we make some claims about frames; their proof is deferred to Sections 4
and 5. First, we claim that frames is correct, that is, there is at least one good branching for
yes-instances.

Lemma 2. If there exists a size-k CSP V satisfying C at the beginning of frames such that the
longest undiscovered block is $\beta$-critical, then frames creates at least one branch such that the
constraint in this branch is satisfied by a size-k CSP V' whose longest undiscovered block has
length at most $2\beta - 1$.

Second, frames increases the exponential part of the running time by a factor that depends 
only on k. 

Lemma 3. Overall, the calls to frames create $(2k)^{4k^2}$ • branches; all other parts of the
algorithm can be performed in poly(n) time.

Finally, to bound the number of branches in the subsequent call to split, and for the
case $\beta < 4$, we use the following lemma.

Lemma 4. When frames terminates, every fragile piece has length at most $12(k^2 + k)k\beta$.

Note that the above also holds before the first call of split. Using these lemmas, we obtain 
our main result. 

Theorem 1. Minimum Common String Partition can be solved in $k^{21k^2} \cdot \mathrm{poly}(n)$ time; it
is thus fixed-parameter tractable with respect to the partition size $k$.

Proof (of Theorem 1). For the correctness proof assume that the instance is a yes-instance
(for a no-instance the algorithm can always check the correctness and size of a CSP before
returning, thus it has empty output for no-instances). Then, assuming that the input strings
are not identical, there is a CSP $\mathcal{P}$ satisfying the initial constraint $C$, which demands only that
there is at least one breakpoint in $x$ and in $y$.

We now show that there is a set $\Pi'$ of powers of 2, all of which are smaller than $n$, such that
the algorithm outputs, in at least one of its branches, a size-$k$ CSP, in case the main algorithm
loop is traversed for this set $\Pi'$.

Let $\beta$ be the smallest integer such that there is a size-$k$ CSP in which the longest block is
$\beta$-critical. Then, the largest integer of $\Pi'$ is $\beta$. Now, if $\beta < 4$ the algorithm directly finds all
breakpoints by a brute-force branching. Otherwise, the procedure split is called. By Lemma 1,
this procedure creates at least one branch where the constraint is satisfied by some size-$k$ CSP $\mathcal{P}$
and all its blocks of length at least $\beta$ are discovered by $C$. Consider an arbitrary branch with
this property. Now, let $\beta$ denote the smallest power of 2 such that there is a CSP satisfying
the current constraint $C$ in which the longest blocks are $\beta$-critical. This integer $\beta$ is the second
largest integer of $\Pi'$. The algorithm now calls frames and by Lemma 2 obtains in at least
one branch a constraint such that there is a size-$k$ CSP that satisfies the constraint in this
branch. Furthermore, also by Lemma 2, the longest undiscovered block in this CSP has length
at most $2\beta - 1$. By the choice of $\beta$, it follows that the longest undiscovered block of this
CSP is $\beta$-critical. Now, the algorithm either finds all breakpoints by brute-force (if $\beta < 4$)
or again calls the procedure split to discover all $\beta$-critical blocks. This whole procedure is
repeated for smaller and smaller $\beta$; each time, $\beta$ is defined as the smallest power of two such
that there is a size-$k$ CSP satisfying the current constraint $C$ whose longest undiscovered block
is $\beta$-critical. The set $\Pi'$ contains exactly all integers obtained this way. Eventually, $\beta < 4$ and
the algorithm branches by brute-force into all cases to set the breakpoints without violating the
current constraint. Clearly, one of these cases is equivalent to a CSP satisfying this constraint.
The algorithm verifies that this is indeed a CSP and that it has size at most $k$ and correctly
outputs the CSP. Hence, the algorithm is correct.

It remains to show the running time of the algorithm. First, the for-each-loop in the main
method is executed $O(2^{\log n}) = O(n)$ times. Second, by the restriction on $\Pi'$, the repeat-loop in
the main method is executed at most $k$ times. To obtain the claimed running time, we bound
the number of branches created in each call to split.

In each call to split the total length of the fragile pieces is less than $(2k) \cdot 12(k^2 + k)k\beta =
24(k^4 + k^3)\beta$: In the first call, $\beta > n/(2k)$, so the bound holds. In the other cases, there are, by
Observation 1, at most $2k - 2$ fragile pieces in $x$ and $y$. Furthermore, in this case split is called
after frames. Thus, by Lemma 4, each fragile piece has length at most $12(k^2 + k)k\beta$, and the
overall bound follows.

The procedure splits the fragile pieces into new pieces of length at most $\lceil \beta/3 \rceil$ (i.e., there is
a distance $\lceil \beta/3 \rceil - 1$ between the left endpoints of two consecutive pieces of the same splitting).
Since $\beta \geq 4$, we have $\lceil \beta/3 \rceil - 1 \geq \beta/6$. Hence, this creates less than $144(k^4 + k^3)$ new pieces of
length $\lceil \beta/3 \rceil$ plus at most one additional shorter piece at the end of each fragile piece. Hence,
$145k^4$ is an upper bound on the number of new pieces. Branching for each piece into the case
that it is solid or fragile can be done in $2^{145k^4}$ branches. The number of necessary branches for
this part of split can be reduced as follows: Since we merge series of consecutive pieces in $F$ or
$S$, and since we do not need to consider branches with more than $k$ solid pieces, we can directly
look for the first and last piece of each $\beta$-critical block. This creates $O\bigl(\binom{145k^4}{4k}\bigr) = O\bigl(\frac{(145k^4)^{4k}}{(4k)!}\bigr)$
branches in each call of split.
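
The counting in this paragraph can be checked numerically. The following is a minimal Python sketch, not part of the algorithm itself: the function names `step` and `full_pieces_bound` are hypothetical, and the sketch assumes that the total length of the fragile pieces is at most $24(k^4 + k^3)\beta$, as stated above.

```python
def step(beta: int) -> int:
    """Distance ceil(beta/3) - 1 between the left endpoints of two
    consecutive new pieces (integer arithmetic, no floating point)."""
    return -(-beta // 3) - 1


def full_pieces_bound(k: int, beta: int) -> int:
    """Upper bound on the number of full-length new pieces, assuming the
    fragile pieces have total length at most 24 * (k^4 + k^3) * beta."""
    total_fragile_length = 24 * (k**4 + k**3) * beta
    return total_fragile_length // step(beta)


# The inequality ceil(beta/3) - 1 >= beta/6 for powers of two beta >= 4,
# written without division as 6 * step(beta) >= beta.
for exponent in range(2, 20):
    beta = 2**exponent
    assert 6 * step(beta) >= beta
    for k in range(1, 8):
        assert full_pieces_bound(k, beta) <= 144 * (k**4 + k**3)
```

The loop only confirms the two inequalities for a range of small values; it is an illustration of the arithmetic, not a proof.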

The matching requires up to $k!$ branches, and the alignment at most $6^k$. Since $145^{4k} \cdot k! \cdot 6^k =
o((4k)!)$, we can bound the number of branches in each call of split by $k^{16k}$. The split procedure
is called at most $k$ times (by the restriction on $\Pi$), thus creating $O((k^{16k})^k) = O(k^{16k^2})$
branches throughout the algorithm. Finally, the number of branches created in frames is $(2k)^{4k^2}$
by Lemma 3, and the number of branches created in the final brute-force can be bounded
as follows. The length of the fragile pieces is $O(k^4 + k^3)$ and we need to guess at most $2k - 2$
precise breakpoint positions from this number. This can be done with $k^{O(k)}$ branches.

Finally, note that all other steps of the algorithm can be clearly performed in polynomial 
time. Altogether, the total running time of the algorithm thus is 

$O(k^{2k} n) \cdot k^{16k^2} \cdot (2k)^{4k^2} \cdot k^{O(k)} \cdot \mathrm{poly}(n) = k^{21k^2} \cdot \mathrm{poly}(n).$

□ 

4 Putting Frames Next to Fixed Pieces 

In this and the next section, we prove the two claimed lemmas concerning frames. Informally,
we show that, with the right constraint in the beginning, frames finds a constraint $C$ that
is satisfied by a size-$k$ CSP $\mathcal{P}$ whose longest undiscovered block has length at most $2\beta - 1$.
Moreover, the length of each fragile piece is $O(k^3\beta)$ in every constraint produced by frames.
The pseudo code of frames is shown in Algorithm 3.

Algorithm 3 Procedure frames. Global variables: $C$, $\beta$.

 1  w := 2βk + 1    // upper bound on window length
 2  repeat:
 3      Compute the maximum extension of each solid piece,
        the piece graph G[C, Φ := ∅], and the strips of each rep-rep path
 4      while there is a frameless fragile piece:
 5          place frames in fragile pieces by applying Frame Rules 1-6
 6      for each fragile piece: apply Fitting Rule 1
 7      new-align := False    // Fix pieces with long periods:
 8      for each repetitive solid piece s (with period π_s):
 9          if all fragile pieces adjacent to s or s' have length at most 6(k^2 + k)||π_s||:
10              for each feasible alignment branch into the case to add this alignment to R_S
11              new-align := True
12  until new-align = False
13  return the modified constraint C

The approach of frames is to use a set of reduction rules to put "frames" into the fragile
pieces, where a frame is an interval within the fragile piece that contains all breakpoints that
are contained in this piece. We call the actual shortest interval containing all breakpoints of a
fragile piece a "window", defined as follows. Let $\mathcal{P}$ be a size-$k$ CSP satisfying $C$, and let $f$ be
a fragile piece in $C$. The window of $f$ is the interval $[a, b]$ such that $\{a, a \triangleright 1\}$ is the leftmost
breakpoint of $\mathcal{P}$ in $f$ and $\{b \triangleleft 1, b\}$ is the rightmost breakpoint of $\mathcal{P}$ in $f$. Since a frame is required
to contain all breakpoints of a fragile piece, it can be seen as a "super"-approximation of the
actual window. A formal definition of frames is as follows.

Definition 6. Let $C$ be a constraint. A frame $[a, b]$ for a fragile piece $f$ of $C$ is an interval that
is contained in $f$. A frame set for $C$ is a set $\Phi$ of frames such that each fragile piece $f$ contains
at most one frame. A CSP $\mathcal{P}$ that satisfies $C$ satisfies a frame set $\Phi$ for $C$ if each breakpoint
of $\mathcal{P}$ is contained in some frame of $\Phi$.
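
To make the notions of window and frame set concrete, here is a minimal Python sketch. It rests on simplifying assumptions that are not part of the paper: markers are modelled as integer positions, a breakpoint $\{a, a \triangleright 1\}$ is identified by its left marker $a$, and the names `window` and `satisfies_frame_set` are hypothetical.

```python
def window(piece, breakpoints):
    """Window [a, b] of a fragile piece (lo, hi): {a, a+1} is the leftmost
    and {b-1, b} the rightmost breakpoint lying inside the piece.
    Returns None if the piece contains no breakpoint."""
    lo, hi = piece
    inside = sorted(p for p in breakpoints if lo <= p and p + 1 <= hi)
    if not inside:
        return None
    return (inside[0], inside[-1] + 1)


def satisfies_frame_set(frames, breakpoints):
    """frames maps each fragile piece (lo, hi) to at most one frame (a, b).
    Checks that every frame lies inside its piece and that every
    breakpoint {p, p+1} is contained in some frame."""
    for (lo, hi), (a, b) in frames.items():
        if not (lo <= a <= b <= hi):
            return False
    return all(
        any(a <= p and p + 1 <= b for (a, b) in frames.values())
        for p in breakpoints
    )


piece = (10, 20)
breaks = [12, 15]
print(window(piece, breaks))                          # the window of the piece
print(satisfies_frame_set({piece: (11, 17)}, breaks))  # frame covers the window
print(satisfies_frame_set({piece: (13, 17)}, breaks))  # frame misses a breakpoint
```

Storing the frames in a dictionary keyed by fragile piece enforces the "at most one frame per fragile piece" condition by construction.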

The approach to add the frames to the constraint can be summarized as follows: first, we
compute an upper bound $w$ on the length of the windows that only depends on $\beta$ and $k$. Then,
we apply a series of frame rules that eventually place a frame in all fragile pieces (Lines 4-5). As
we will show, the frame length then depends on $w$ (and thus on $k$ and $\beta$) and on the maximum
period length of the unfixed (repetitive) solid pieces. Since the frames contain all breakpoints
of $\mathcal{P}$, it is possible to reduce fragile pieces until they "fit" their frames (Line 6). We now check
whether there are some unfixed solid pieces with a long period compared to $w$. If this is the case,
then the number of "feasible" alignments for these pieces is small, and we can thus branch how
to align these pieces (Lines 7-11). Formally, we call an alignment of $s = [a, b]$ and $s' = [a', b']$
feasible for $C$ if the interval equidistant to $[a, b]$ ($[a', b']$) from $s$ does not intersect any other
solid piece than $s'$ ($s$) in $C$. Note that each satisfying CSP can only have feasible alignments,
otherwise there is at least one fragile piece without breakpoints.

Afterwards, we go back to applying the frame rules (we will obtain shorter frames since 
the number of fixed pieces has increased). If this is not the case, that is, all periods are short 
compared to w, then we show that the maximum frame length depends only on w. Hence, in this 
case they are sufficiently short, and the frames procedure has achieved its goal. The algorithm 
thus returns to the main method where it calls split to find new solid pieces. 

In this section, we describe the frame rules that place frames in fragile pieces which are next 
to fixed pieces and some further simple frame rules. Before doing so, we define two concepts that 
will be used by the frame rules: maximum extensions and the piece graph. Roughly speaking, 
maximum extensions are used locally to bound the position of some breakpoints in the fragile 
pieces. The piece graph provides a structural view of the relationship between pieces and is used 
to show that one of the frame rules can always be applied in case there is a frameless fragile 
piece. 

Maximum extension of solid pieces. Let $s$ be a solid piece in a constraint $C$. The maximum
extension of $s$ is the interval $[l_{ext}(s), r_{ext}(s)]$ containing $s$ where $r_{ext}(s)$ and $l_{ext}(s)$ are defined as
follows. If $s$ is fixed, then let $\ell$ be the largest integer such that $[s^*, s^* \triangleright \ell] = [s'^*, s'^* \triangleright \ell]$, and that
no marker of $[s^*, s^* \triangleright \ell]$ or $[s'^*, s'^* \triangleright \ell]$ is in a solid piece other than $s$ or $s'$. Then $r_{ext}(s) := s^* \triangleright \ell$
and $r_{ext}(s') := s'^* \triangleright \ell$. If $s$ is repetitive with shortest period $\pi_s$, then let $a$ be the leftmost marker
in $s$ and define $r_{ext}(s)$ as the rightmost marker such that the interval $[a, r_{ext}(s)]$ has period $\pi_s$,
and that no marker in $[a, r_{ext}(s)]$ is in a solid piece other than $s$. Marker $l_{ext}(s)$ is obtained
symmetrically.

Fig. 3. Left: two pairs of fixed solid pieces $(s, s')$ and $(t, t')$. Reference markers are shown with an asterisk,
maximum extensions are delimited with dashed lines, and the breakpoints of some possible CSP are marked with
red crosses. Right: a simplified representation of the same pieces, where thick (resp. thin) lines are used for solid
(resp. fragile) pieces.
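
The right end of a maximum extension can be computed by a simple scan. The following Python sketch is an illustration under stated assumptions rather than the paper's implementation: markers are 0-based string indices, `blocked_x` and `blocked_y` are hypothetical sets of positions that belong to other solid pieces, and a repetitive piece is assumed to be at least one period long.

```python
def right_extension_fixed(x, y, ref_x, ref_y,
                          blocked_x=frozenset(), blocked_y=frozenset()):
    """Largest ell such that x[ref_x .. ref_x+ell] equals y[ref_y .. ref_y+ell]
    and no marker of either interval lies in another solid piece.
    Assumes the reference markers carry the same letter."""
    ell = 0
    while True:
        i, j = ref_x + ell + 1, ref_y + ell + 1
        if i >= len(x) or j >= len(y):
            break
        if x[i] != y[j] or i in blocked_x or j in blocked_y:
            break
        ell += 1
    return ell


def right_extension_repetitive(x, right, period_len, blocked=frozenset()):
    """Rightmost index r >= right such that the periodic piece ending at
    `right` can be extended to r while keeping period length period_len
    and without entering another solid piece."""
    r = right
    while (r + 1 < len(x)
           and r + 1 - period_len >= 0
           and x[r + 1] == x[r + 1 - period_len]
           and (r + 1) not in blocked):
        r += 1
    return r


# Fixed pair with reference markers x[2] and y[0]: "abcd" is common.
print(right_extension_fixed("xxabcdey", "abcdfzz", 2, 0))
# Repetitive piece x[0..3] = "abab" with period length 2.
print(right_extension_repetitive("abababcab", 3, 2))
```

The left ends $l_{ext}(s)$ would be obtained by the mirrored scans.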

The following proposition is a straightforward consequence of the definition of maximum 
extension. 

Proposition 1. Let $s$ be a fixed solid piece, and let $[a, b]$ and $[c, d]$ be two intervals that are
equidistant from $s$ and such that $[a, b]$ is contained in $[l_{ext}(s), r_{ext}(s)]$ and $[c, d]$ is contained
in $[l_{ext}(s'), r_{ext}(s')]$. Then, $[a, b] = [c, d]$.

Note that, as a special case, the above proposition includes single markers (that is, length-one 
intervals). The next proposition simply states formally that the maximum extensions of a solid 
piece contain the block which contains the solid piece. 

Proposition 2. Let $C$ be a constraint and $s$ be a solid piece of $C$. Any CSP that satisfies $C$
has a block which contains $s$ and is contained in $[l_{ext}(s), r_{ext}(s)]$. Furthermore, let $f$ be a fragile
piece next to $s$. Then, the window in $f$ contains at least one marker of $[l_{ext}(s), r_{ext}(s)]$.

Proof (of Proposition 2). If $s$ is fixed, then the block containing $s$ cannot contain the marker $l_{ext}(s) \triangleleft 1$:
if this marker is contained in the block, it is matched to $l_{ext}(s') \triangleleft 1$. By definition of $l_{ext}(s)$,
either $l_{ext}(s) \triangleleft 1 \neq l_{ext}(s') \triangleleft 1$ or one of $l_{ext}(s) \triangleleft 1$, $l_{ext}(s') \triangleleft 1$ belongs to a different solid piece $t$.
In the first case, we do not obtain a CSP; in the second case, there is at least one fragile piece
without a breakpoint. Similarly, the block containing $s$ cannot contain $r_{ext}(s) \triangleright 1$.

Every repetitive solid piece $s$ is contained in a block which is periodic with the same shortest
period $\pi$ as $s$. By definition of $l_{ext}()$ and $r_{ext}()$ for repetitive solid pieces, this block must thus
be contained in $[l_{ext}(s), r_{ext}(s)]$.

Finally, consider a fragile piece $f$ to the right (to the left) of $s$. The window in $f$ contains the
last (first) marker of the block containing $s$. By the above, it thus contains at least one marker
of $[l_{ext}(s), r_{ext}(s)]$. □



The Piece Graph. Given a constraint $C$ and a frame set $\Phi$, the piece graph $G[C, \Phi]$ is the bipartite
graph $G := (V_S \cup V_F, E)$ constructed as follows.

- $V_F$ contains one vertex $v_f$ for each frameless fragile piece $f \in F$,

- $V_S$ contains, for each repetitive solid piece $s \in S_x$, a vertex $v_s$, and for each fixed piece $s \in S_x$,
two vertices $l_s$ and $r_s$ (for left and right).

- For a fixed solid piece $s$ and a fragile piece $f \in F$, $G$ contains the edge $\{v_f, l_s\}$ if the last
marker of $f$ is the first marker of $s$ or of $s'$, and the edge $\{v_f, r_s\}$ if the first marker of $f$ is
the last marker of $s$ or of $s'$.

- For an unfixed solid piece $s$, $G$ contains the edge $\{v_f, v_s\}$ if the first marker of $f$ is the last
marker of either $s$ or $s'$ or if the last marker of $f$ is the first marker of either $s$ or $s'$.

Fig. 4. Frame Rules 1-6 of frames. Frames are drawn as red boxes, the frame created at each step is dashed.
Possible breakpoint positions in $\mathcal{P}$ are shown as red crosses.
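
The construction of the piece graph can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not from the paper: each string is given as a left-to-right list of `(kind, name)` pairs with `kind` in `"fragile"`, `"fixed"`, `"repetitive"`; a solid piece $s$ in $x$ and its partner $s'$ in $y$ share the same name; and `framed` is a hypothetical set of names of fragile pieces that already have a frame.

```python
def piece_graph(x_pieces, y_pieces, framed=frozenset()):
    """Return the edge set of the piece graph as pairs (v_f, u).
    Vertices: ("fragile", f) for frameless fragile pieces, ("rep", s) for
    repetitive pieces, and ("l", s), ("r", s) for fixed pieces."""
    edges = set()
    for seq in (x_pieces, y_pieces):
        for i, (kind, name) in enumerate(seq):
            if kind != "fragile" or name in framed:
                continue
            v_f = ("fragile", name)
            # A solid piece to the right of f is touched at its left end
            # (vertex l_s); one to the left of f at its right end (r_s).
            for j, side in ((i + 1, "l"), (i - 1, "r")):
                if 0 <= j < len(seq):
                    n_kind, n_name = seq[j]
                    if n_kind == "fixed":
                        edges.add((v_f, (side, n_name)))
                    elif n_kind == "repetitive":
                        edges.add((v_f, ("rep", n_name)))
    return edges


def degree(edges, vertex):
    """Number of edges incident to the given vertex."""
    return sum(vertex in edge for edge in edges)


x = [("fragile", "f1"), ("fixed", "s"), ("fragile", "f2")]
y = [("fixed", "s"), ("fragile", "g2")]   # s' contains the first marker of y
g = piece_graph(x, y)
print(sorted(g))
print(degree(g, ("l", "s")), degree(g, ("r", "s")))
```

In this toy instance the vertex $l_s$ has degree one because $s'$ has no fragile piece to its left.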

Note that the vertices $v_s$ or $l_s$ and $r_s$ are only defined for pieces $s \in S_x$, but they represent
both pieces $s$ and $s'$. Observe furthermore that in case $V_F \neq \emptyset$, there are fragile pieces in $C$
that do not have a frame in $\Phi$. Moreover, note that in this case the edge set of the piece graph
is nonempty. Our aim will thus be to gradually apply the frame rules until the piece graph is
edge-less. Each vertex is called fragile, fixed or repetitive depending on the nature of the piece
it represents. Note that most vertices of the graph have degree at most 2, except for repetitive
vertices which can have degree up to 4. Vertices with smaller degree correspond initially to the
four pieces at the end of the sequences.

In order to deal seamlessly with pieces at the end of the input strings (where no fragile
piece is adjacent on one side), we introduce "phantom frames" as follows. If $s$ contains the
first element of a string, i.e., $x[1]$ or $y[1]$, we say that $s$ has the phantom frame $[x[0], x[1]]$ (resp.
$[y[0], y[1]]$) to its left. Likewise, if $s$ contains $x[n]$ or $y[n]$, it has the phantom frame $[x[n], x[n+1]]$
(resp. $[y[n], y[n+1]]$) to its right.

We now have collected the prerequisites to state the frame rules. A frame rule is an algorithm
that receives as input a constraint $C$ and a frame set $\Phi$ and updates both into a constraint $C'$
and a frame set $\Phi'$. A frame rule is correct if the following holds. First, if there is a size-$k$ CSP $\mathcal{P}$
satisfying $C$ and $\Phi$, then there is also a size-$k$ CSP $\mathcal{P}'$ satisfying $C'$ and $\Phi'$. Second, the longest
undiscovered block in $\mathcal{P}'$ is at most as long as the longest undiscovered block in $\mathcal{P}$ (this additional
restriction will be used to argue that the choice of $\beta$ remains correct). Note that without loss
of generality, we describe all rules for pieces in $x$, but they apply to fragile pieces in $x$ and $y$.
Furthermore, if a rule works on a single fixed vertex in the piece graph, then we assume that
this vertex is a left vertex $l_s$ (by inverting the instance one can also deal with all right vertices).
Finally, we state the additional frames of all rules by defining an interval which contains the
window; in order to ensure that the frames are within the fragile pieces, we always intersect this
interval with the considered fragile piece $f$. The first rule puts frames into fragile pieces at the
end of the string.

Frame Rule 1. If the piece graph contains a fragile degree-one vertex $v_f$, then $f$ contains either
$x[1]$ or $x[n]$. If $f$ contains $x[1]$, add $f \cap [x[1], x[1] \triangleright w]$ to $\Phi$; otherwise add $f \cap [x[n] \triangleleft w, x[n]]$ to $\Phi$.
Proof (of the correctness of Frame Rule 1). Fragile pieces of $x$ that do not contain the first or
the last marker of $x$ are preceded and followed by a solid piece (since the splitting is alternating)
and thus the corresponding vertex in the piece graph has degree two. Hence, a fragile piece in $x$
corresponding to a degree-one vertex of the piece graph contains either the first or the last
marker of $x$. Assume without loss of generality that $f$ contains $x[1]$. The leftmost block of $\mathcal{P}$
in $x$ is necessarily a short block since it is contained in the fragile piece $f$. Hence, marker $x[1]$
belongs to the first short block of $\mathcal{P}$ and it is next to a breakpoint of $\mathcal{P}$. Since the window
(which contains all breakpoints in $f$) has length at most $w$, it is contained in the created frame
$[x[1], x[1] \triangleright w]$. □
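
Frame Rule 1 amounts to a single interval intersection. A minimal Python sketch, assuming that markers of $x$ are the integer positions $1, \dots, n$ and that a fragile piece is a closed interval `(lo, hi)`; the function names are hypothetical:

```python
def intersect(a, b):
    """Intersection of two closed integer intervals, or None if empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def frame_rule_1(f, n, w):
    """Frame for a fragile piece f = (lo, hi) at an end of x (positions 1..n),
    where w is the upper bound on the window length."""
    if f[0] == 1:                       # f contains x[1]
        return intersect(f, (1, 1 + w))
    if f[1] == n:                       # f contains x[n]
        return intersect(f, (n - w, n))
    raise ValueError("Frame Rule 1 applies only to pieces at an end of x")


print(frame_rule_1((1, 50), 100, 9))    # frame at the left end
print(frame_rule_1((80, 100), 100, 9))  # frame at the right end
```

Intersecting with $f$ keeps the frame inside the fragile piece even when $f$ is shorter than $w$.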

Frame Rule 2. If the piece graph contains a degree-one vertex $l_s$ with neighbor $v_f$ such that $f$
is next to $s$ and $s'$ does not contain $y[1]$, then: let $[s'^* \triangleleft u, s'^* \triangleleft v]$ be the (possibly phantom) frame
to the left of $s'$ in $y$; add the frame $f \cap [s^* \triangleleft (u + w - 1), s^* \triangleleft v]$ to $\Phi$.

Proof (of the correctness of Frame Rule 2). Consider first the case where $[s'^* \triangleleft u, s'^* \triangleleft v]$ is a
phantom frame: in this case, $s'^* \triangleleft v$ is $y[1]$ and $u = v + 1$. Hence, $y[1]$ is the first element of the
block containing $s'$. Since $y[1]$ and $s^* \triangleleft v$ are equidistant from $s$, $s^* \triangleleft v$ is the first element of
the block containing $s$ and the last element of the window in $f$. Since the window has length at
most $w - 1$, it is contained in the frame $[s^* \triangleleft (v + w), s^* \triangleleft v] = [s^* \triangleleft (u + w - 1), s^* \triangleleft v]$.

Consider now the (regular) case where $s'$ has a fragile piece $g$ to its left. By the frame
definition, all breakpoints of a satisfying CSP $\mathcal{P}$ that are in $g$ are within $[s'^* \triangleleft u, s'^* \triangleleft v]$. Hence,
$s'^* \triangleleft v$ is in the same block as $s'$. Consequently, the right limit of the window in $f$ is to the left of
$s^* \triangleleft v$ in $f$. Similarly, $s'^* \triangleleft u$ is in a different block than $s'$ and thus there is a breakpoint to the
right of $s^* \triangleleft u$ in $f$. All other breakpoints in $f$ can have distance at most $w$ from this breakpoint.
Hence, all breakpoints in $f$ are contained in the created frame $[s^* \triangleleft (u + w - 1), s^* \triangleleft v]$. □
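
The frame produced by Frame Rule 2 is obtained by transferring the offsets $u$ and $v$ from $s'$ to $s$. A minimal Python sketch, assuming that markers are integer positions, that `s_ref` is the position of the reference marker $s^*$ in $x$, and that the function name is hypothetical:

```python
def frame_rule_2(f, s_ref, u, v, w):
    """Frame f ∩ [s* ◁ (u + w - 1), s* ◁ v] for the fragile piece
    f = (lo, hi) to the left of s, where [s'* ◁ u, s'* ◁ v] is the
    (possibly phantom) frame to the left of s' in y."""
    lo = max(f[0], s_ref - (u + w - 1))
    hi = min(f[1], s_ref - v)
    return (lo, hi) if lo <= hi else None


# Regular case: the frame left of s' spans offsets u = 12 down to v = 7.
print(frame_rule_2((1, 40), 45, 12, 7, 9))
# Phantom case: u = v + 1, so the frame has the form [s* ◁ (v + w), s* ◁ v].
print(frame_rule_2((1, 40), 45, 4, 3, 9))
```

In the phantom case the left end simplifies to $s^* \triangleleft (v + w)$, matching the first part of the proof.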

The above rules are relatively straightforward inferences of frame positions that can be made 
because the piece graph has degree-one vertices. We now show some more intricate rules that 
deal with the remaining cases. In particular, we show how to deal with cycles in the piece graph. 
We first consider cycles without repetitive solid pieces. Note that the following rule performs 
a branching. We thus extend the correctness notion to hold if there is at least one branch in 
which the created constraint and frame set can be satisfied. 

Frame Rule 3. If the piece graph contains a simple cycle without repetitive vertices, then create
one branch for each edge $\{v_f, u_s\}$ of this cycle. In each branch, add to $\Phi$ the frame

- $f \cap [l_{ext}(s) \triangleleft w, l_{ext}(s) \triangleright (2w)]$ if $u_s = l_s$ for some solid piece $s$, or

- $f \cap [r_{ext}(s) \triangleleft 2w, r_{ext}(s) \triangleright w]$ if $u_s = r_s$ for some solid piece $s$.

The following is a straightforward property of constraints and satisfying solutions and is used
for showing the correctness of Frame Rule 3.

Proposition 3. Let $s$ be a fixed solid piece in a constraint $C$. If markers $a$ and $a'$ are equidistant
from $s$, then for any integer $i$, $a \triangleright i$ and $a' \triangleright i$ are equidistant from $s$. Moreover, given a CSP
$\mathcal{P}$ satisfying $C$, the first markers (the last markers) of the blocks of $\mathcal{P}$ containing $s$ and $s'$ are
equidistant from $s$.

Proof. The first part is directly obtained by definition:

$s^*(a \triangleright i) = s^*a + i = s'^*a' + i = s'^*(a' \triangleright i).$

For the second part, simply note that if $s^*$ is at position $j$ in the block containing $s$, then $s'^*$ is
also at position $j$ in the block containing $s'$. Hence, the first markers (and thus also the last markers) of both blocks
are equidistant from $s$. □
Proof (of the correctness of Frame Rule 3). Let $\mathfrak{P}$ be the set of CSPs that satisfy the constraint
$C$ and frame set $\Phi$ and additionally have a minimum total length of short blocks. We show that
there is a $\mathcal{P} \in \mathfrak{P}$ which has all breakpoints in $[l_{ext}(s) \triangleleft w, l_{ext}(s) \triangleright (2w)]$ for some vertex $l_s$ of the
cycle, thus showing correctness of the rule.

Since the piece graph $G[C, \Phi]$ is bipartite with partition $V_S$ and $V_F$, the cycle alternates
between vertices of $V_S$ and $V_F$. Moreover, all cycle vertices from $V_S$ are fixed, and alternate
between left and right vertices (each fragile vertex of the cycle is adjacent to a left vertex and to
a right vertex). Hence there exist solid pieces $s_1, s_2, \ldots, s_\ell$ and fragile pieces $f_1, f_2, \ldots, f_\ell$ such
that the cycle is $(l_{s_1}, v_{f_1}, r_{s_2}, v_{f_2}, \ldots, l_{s_{\ell-1}}, v_{f_{\ell-1}}, r_{s_\ell}, v_{f_\ell})$. For simplicity, we consider indices only
modulo $\ell$ (that is, $s_{\ell+1} = s_1$, $f_0 = f_\ell$, etc.), and we assume that fragile pieces with odd indices
are in $x$ and those with even indices are in $y$. Consider a CSP $\mathcal{P} \in \mathfrak{P}$ such that there is no $l_s$
whose window is contained in $[l_{ext}(s) \triangleleft w, l_{ext}(s) \triangleright (2w)]$. We transform this CSP into one that
fulfills this property. We first prove that in $\mathcal{P}$ either all fragile pieces with odd or all fragile
pieces with even indices contain only one breakpoint. Assume towards a contradiction that
there exist integers $i < j$ of different parity such that $f_i$ and $f_j$ both have windows with at
least two breakpoints and for each $h$ with $i < h < j$, $f_h$ contains only one breakpoint. Assume
without loss of generality that $i$ is odd and $j$ is even. Hence, $f_i$ is in $x$ to the right of $s_{i+1}$ and
$f_j$ is in $y$ to the right of $s_j$.

For all $h$, $i \leq h \leq j$, let $a_h$ be the leftmost marker of the window in $f_h$, and $b_h = a_h \triangleright 1$. For
odd $h$, $a_h$ and $a_{h+1}$ are the rightmost markers of the blocks containing $s_{h+1}$ and $s'_{h+1}$ and thus
equidistant from $s_{h+1}$. For even $h < j$, $b_h$ and $b_{h+1}$ are the left endpoints of the blocks containing
$s_{h+1}$ and $s'_{h+1}$, so they are equidistant from $s_{h+1}$. By Proposition 3, for all $i \leq h < j$, $[a_h, b_h]$ and
$[a_{h+1}, b_{h+1}]$ are equidistant from $s_{h+1}$. By definition of $a_h$, the window in each $f_h$ is contained in
$[a_h, a_h \triangleright w]$. If one of these intervals is not contained in the maximum extension of an adjacent
solid piece, say $[a_h, a_h \triangleright w]$ is not contained in the maximum extension of $s_{h+1}$, then $l_{ext}(s_{h+1})$
is contained in $[a_h, a_h \triangleright w]$. Hence, the window is contained in $[l_{ext}(s_{h+1}) \triangleleft w, l_{ext}(s_{h+1}) \triangleright w]$,
contradicting our assumption on $\mathcal{P}$. In the following, we thus assume that all intervals $[a_h, a_h \triangleright w]$
are contained in the maximum extension of adjacent solid pieces, which by Proposition 1 implies
that they all have the same content. In particular, this implies $[a_i, a_i \triangleright w] = [a_j, a_j \triangleright w]$.

We now describe a modification of $\mathcal{P}$ that results in a new CSP which is not larger than $\mathcal{P}$,
also satisfies the constraint and frame set but has smaller total length of short blocks; the
modification is illustrated in Figure 5. Let $u + 1$ and $v + 1$ be the lengths of the leftmost short
blocks in $f_i$ and $f_j$ respectively (assume without loss of generality that $u \leq v$). These two short
blocks are thus $[b_i, b_i \triangleright u]$ and $[b_j, b_j \triangleright v]$, and they are matched in $\mathcal{P}$ to other short blocks $[b'_i, b'_i \triangleright u]$
and $[b'_j, b'_j \triangleright v]$. Note that since $f_i$ is odd and $f_j$ is even, $[b_i, b_i \triangleright u]$ is in a different string than
$[b_j, b_j \triangleright v]$. To create the new solution $\mathcal{P}'$ from $\mathcal{P}$, apply the following modifications. First, cut out
$u + 1$ markers from the left of $[b'_j, b'_j \triangleright v]$ (recall that $u \leq v$), which gives two new blocks $[b'_j, b'_j \triangleright u]$
and $[b'_j \triangleright (u + 1), b'_j \triangleright v]$ if $u < v$ and leaves the block $[b'_j, b'_j \triangleright v]$ unmodified if $u = v$. Now, match
block $[b'_i, b'_i \triangleright u]$ to $[b'_j, b'_j \triangleright u]$ (recall that these blocks are in different strings). Now, shift the
breakpoints of the fragile pieces of the cycle as follows. For every odd $h$, $i < h < j$, cut out
$u + 1$ markers from the left of the blocks containing $s'_h$ and $s_h$. And for every even $h$, $i < h \leq j$, add $u + 1$ markers
to the right of the blocks containing $s_h$ and $s'_h$. Finally, in case $u < v$, match the shortened
block $[b_j \triangleright (u + 1), b_j \triangleright v]$ to the block $[b'_j \triangleright (u + 1), b'_j \triangleright v]$ created in the first step. Note that by the
discussion above, the pieces added to $s_h$ and $s'_h$ for even $h$ have the same content. Hence, all
matched blocks have equal content. Furthermore, since the block $[b_i, b_i \triangleright u]$ is now unmatched,
its markers are free to be added to $s_{i+1}$.

Fig. 6. Illustration for the correctness proof of the second part of Frame Rule 3. Given a cycle with total length
of short blocks $p > 0$, we construct intervals $[a_h, b_h]$ and $[b_h, c_h]$ as shown (delimited by blue dotted lines). All the
breakpoints in intervals $[a_h, b_h]$ can be shifted to the corresponding $[b_h, c_h]$.

This new solution has at most as many blocks as $\mathcal{P}$: we have created at most one new
breakpoint in $[b'_j, b'_j \triangleright v]$ and removed a breakpoint in $f_i$ by adding exactly $u + 1$ markers to
the right of $s_{i+1}$. For all other fragile pieces $f_h$, the breakpoint has "only" been shifted to the
right. Furthermore, $\mathcal{P}'$ satisfies the same constraint $C$ as $\mathcal{P}$: the matching only changed between
short blocks which are not constrained. Moreover, the fragile pieces for which the breakpoints
have been modified are either frameless (if they are on the cycle) or the modification adds a
breakpoint that is between two breakpoints (in the modification of $[b'_j, b'_j \triangleright v]$). However, the total
length of the short blocks has been reduced by $2(u + 1)$, which contradicts the choice of $\mathcal{P}$.
We now know that in $\mathcal{P}$ the short blocks of the cycle are either all in $x$ or all in $y$. In the
following, we assume they are all in $x$, that is, in fragile pieces $f_j$ with odd $j$. We now consider
the following two cases: either there is no short block, even in $x$, or there is at least one.

First consider the case that there is no short block in the cycle, that is, all the windows
contain only one breakpoint $[a_h, b_h]$. If all markers $b_h$ are within the maximum extensions of
both adjacent solid pieces, we create a new solution $\mathcal{P}'$ from $\mathcal{P}$ as follows: for every odd $h$, cut
out $b_h$ and $b_{h-1}$ from the left end of the blocks containing $s_h$ and $s'_h$, and for every even $h$,
add $b_h$ and $b_{h-1}$ to the right end of the blocks containing $s'_h$ and $s_h$. The solution $\mathcal{P}'$ satisfies
the same constraints as $\mathcal{P}$, with the same total length of short blocks. Repeat this operation
of shifting the breakpoints to the right until for some $i$ (without loss of generality, assume $i$
is even), $b_i$ is to the right of $r_{ext}(s_i)$. Then, the rule is correct, since for some branch the edge
$(r_{s_i}, v_{f_i})$ is selected and the frame $[r_{ext}(s_i) \triangleleft 2w, r_{ext}(s_i) \triangleright w]$, which contains the only breakpoint
of $\mathcal{P}$ in $f_i$, is added to $\Phi$.

It remains to show the case where there is at least one short block in the fragile pieces of
the cycle, that is, the total length $p$ of the short blocks of the cycle is at least one. Note that
by the choice of $w$, $p < w$. We now show that the strings around the windows are periodic
with period length $p$, so that we can again shift all the breakpoints of the fragile pieces to the
right by steps of length $p$, until at least one of them has distance at most $p$ from the end of a
maximum extension.

We first introduce some notations (see Figure 6 for an illustration): for each $h$, let $[d_h, e_h]$
denote the window of $f_h$. Let $b_1 = e_1$, $a_1 = b_1 \triangleleft p$, $c_1 = b_1 \triangleright p$, and for each $h$, $2 \leq h \leq \ell$, let $a_h$,
$b_h$, and $c_h$ be the markers equidistant with $a_{h-1}$, $b_{h-1}$, and $c_{h-1}$ from $s_h$.

We first show that for every $h$ with $2 \leq h \leq \ell$, we have

$e_h b_h = d_{h-1} b_{h-1} - 1.$    (1)

For even values of $h$, $d_{h-1}$ and $d_h$ are equidistant from $s_h$, so $d_h b_h = d_{h-1} b_{h-1}$. Since $f_h$ is
in string $y$, it contains only one breakpoint, and thus $d_h e_h = 1$ and Equation (1) follows. For
odd values of $h$, we have $d_{h-1} e_{h-1} = 1$, and $e_{h-1}$ and $e_h$ are equidistant from $s_h$, thus $e_h b_h =
e_{h-1} b_{h-1}$, which also implies Equation (1). Hence the distance between window endpoint $e_h$ and
the marker $b_h$ increases, compared to the distance of $e_{h-1}$ and $b_{h-1}$, by the length of the short
blocks contained in the window of $f_{h-1}$. This has two implications: first, in $f_\ell$, we have $e_\ell b_\ell = p$
and thus $e_\ell = a_\ell$ (by definition $a_1$ has distance $p$ from $b_1$, and this distance is conserved through
the cycle). Second, for every $j$, the short blocks in $f_j$ are contained in $[a_j, b_j]$, and the window
is contained in $[a_j \triangleleft 1, b_j]$.

First, consider the case where each interval $[a_h, c_h]$ is contained in the maximal extensions of
both adjacent blocks. Thus, with Proposition 1, we have $[a_h, b_h] = [a_1, b_1]$ and $[b_h, c_h] = [b_1, c_1]$
for all $h$. We can now "close" the cycle: since $e_\ell$ and $e_1$ are the left endpoints of the blocks
containing $s'_1$ and $s_1$, they are aligned with respect to $s_1$. Moreover, $e_\ell = a_\ell$ and $e_1 = b_1$, so $a_\ell$ and $b_1$
are equidistant from $s_1$, which implies that $[a_\ell, b_\ell] = [b_1, c_1]$. This now implies that, for all $h$,
$[a_h, b_h] = [b_h, c_h]$. We now create a solution $\mathcal{P}'$ from $\mathcal{P}$ as follows: for odd values of $h$, cut out
the $p$ leftmost markers from each block containing $s_h$ or $s'_h$. For even values of $h$, add $p$ markers
to the right of blocks containing $s_h$ or $s'_h$. Match every short block that
was matched to some $[u, v]$ in some $f_h$ to $[u \triangleright p, v \triangleright p]$ instead. The solution $\mathcal{P}'$ is again a CSP
satisfying the same constraints, with the same total length of short blocks but with all the
breakpoints in the cycle shifted to the right by $p$ positions. Repeat this operation until for some
$h$ the interval $[a_h, c_h]$ is no longer contained in the maximal extension of the block to its right.
Then, $[a_h, c_h]$ contains $l_{ext}(s_h)$, and interval $[a_h \triangleleft 1, b_h]$ is contained in $[l_{ext}(s_h) \triangleleft 2w, l_{ext}(s_h) \triangleright w]$.
As argued above, the rule is correct if such a $\mathcal{P} \in \mathfrak{P}$ exists. Note that the modifications made
in the proof do not increase the length of any short block. Hence, the second requirement for
correctness is also satisfied. □

The rules presented so far deal with fixed solid pieces. In fact, if all solid pieces are fixed, 
then these rules suffice to obtain frames in all fragile pieces. With the following three rules, we 
thus deal with the presence of repetitive solid pieces. 

5 Frame Rules for Repetitive Pieces 

In the rules, we have to deal with cycles in the piece graph that contain some repetitive vertices.
We introduce the following concepts in order to analyze the structure of paths between repetitive
vertices that contain fixed solid vertices. A rep-rep path $(v_s, v_{f_1}, u_1, v_{f_2}, u_2, \ldots, u_{\ell-1}, v_{f_\ell}, v_t)$ is
a simple path of the piece graph such that the two endpoints $v_s$ and $v_t$ are repetitive vertices,
and each $u_i$ is a fixed solid vertex. Given a rep-rep path joining repetitive vertices $v_s, v_t$ and
going through fragile vertices $v_{f_1}, v_{f_2}, \ldots, v_{f_\ell}$, we define the strip of the path (see Figure 7) as
a set of intervals $\{I_{f_1}, I_{f_2}, \ldots, I_{f_\ell}\}$ such that:

1. Consecutive intervals $I_{f_i}, I_{f_{i+1}}$ are equidistant from the solid piece represented by $u_i$.

2. Each interval $I_{f_i}$ is contained in the maximum extensions of the two solid pieces next to $f_i$.

3. The length of $I_{f_1}$ is maximal under Conditions 1 and 2.

Fig. 7. Example of a rep-rep path joining repetitive vertices $v_s$ and $v_t$ (with respective periods ab and ababc),
and going through three fragile vertices and their adjacent fixed vertices. The strip of each fragile piece is delimited
by the green dotted lines.

Proposition 4. All the strips in a rep-rep path have the same length and content. Each interval
of the strip is contained in its respective fragile piece. Moreover, the strip of a rep-rep path
is uniquely defined and computable in polynomial time.

Proof (of Proposition 4). The fact that strips have the same length and content is a direct
consequence of Proposition 1, which can be applied according to Conditions 1 and 2. Each strip
must be contained in its fragile piece since it is in the intersection of the maximum extensions
of the two adjacent solid pieces.

The second part of the claim can be seen by considering the following algorithm to compute
the strip. First, check whether the strip is nonempty. That is, try the following for each marker $a_1$
in $f_1$. Compute the marker $a_2$ in $f_2$ that is equidistant with $a_1$ from $u_1$. Then, compute the
marker $a_3$ in $f_3$ that is equidistant with $a_2$ from $u_2$, and so on. If all $a_i$'s are in the maximum
extensions of both solid pieces next to $f_i$, then the strip is nonempty. If this fails for every
choice of $a_1$, then the length of $I_{f_1}$ is zero. Now, assume that there was one $a_1$ for which the
above procedure is successful, that is, $I_{f_1}$ contains one or more markers. Then, set
$I_{f_i} := \{a_i\}$ for each $i$. Now try to simultaneously expand all $I_{f_i}$'s. That is, check whether
one can add the marker to the left of each $I_{f_i}$ without violating Condition 2 of the strip
definition. If this is the case, then add these markers to the $I_{f_i}$'s and repeat. Once this is no
longer possible, continue by adding markers to the right until this is also not possible anymore.
The resulting set of $I_{f_i}$'s is the strip of the rep-rep path. □
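The seed-and-expand procedure from this proof can be sketched in a simplified integer model. This is only an illustration under assumptions that are not part of the paper: markers are integers, the markers of fragile piece i that satisfy Condition 2 form one contiguous range `allowed[i]`, and equidistance between consecutive fragile pieces is modelled by a hypothetical translation `shift[i]`. The names `compute_strip`, `allowed` and `shift` are made up for this sketch.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # inclusive marker range (lo, hi)


def compute_strip(allowed: List[Interval], shift: List[int]) -> Optional[List[Interval]]:
    """Seed-and-expand computation of a strip in a simplified integer model.

    allowed[i]: markers of fragile piece i that lie in the maximum extensions
                of both neighbouring solid pieces (Condition 2); assumed to be
                one contiguous range.
    shift[i]:   hypothetical offset such that marker a in piece i and marker
                a + shift[i] in piece i + 1 are equidistant from the solid
                piece between them (Condition 1); len(shift) == len(allowed) - 1.
    Returns one interval per fragile piece, or None if the strip is empty.
    """
    def chain(a1: int) -> List[int]:
        # Follow the equidistance relation from the first piece to the last.
        markers = [a1]
        for s in shift:
            markers.append(markers[-1] + s)
        return markers

    def feasible(markers: List[int]) -> bool:
        # Condition 2 must hold in every fragile piece at once.
        return all(lo <= a <= hi for a, (lo, hi) in zip(markers, allowed))

    # Step 1: find a seed marker a1 whose whole chain satisfies Condition 2.
    lo0, hi0 = allowed[0]
    seed = next((chain(a1) for a1 in range(lo0, hi0 + 1) if feasible(chain(a1))), None)
    if seed is None:
        return None  # the strip has length zero

    # Step 2: expand all intervals simultaneously, first left, then right.
    left, right = list(seed), list(seed)
    while feasible([a - 1 for a in left]):
        left = [a - 1 for a in left]
    while feasible([a + 1 for a in right]):
        right = [a + 1 for a in right]
    return list(zip(left, right))
```

In this model all returned intervals have the same length by construction, which mirrors the first claim of the proposition; the running time is polynomial in the number of markers.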

Proposition 5. Let $\mathcal{P}$ be any solution satisfying constraint $C$ such that the total length of all
windows in $\mathcal{P}$ is $p$. In each fragile piece $f$ of a rep-rep path of $C$, writing $I_f = [c, d]$, the window
of $f$ is contained in $[c \triangleleft p, d \triangleright p]$.

Proof (of Proposition 5). We first introduce some notation: let $f_1, f_2, \ldots, f_\ell$ be the fragile
pieces of the path, and, for every $1 \le j \le \ell$, let $[a_j, b_j]$ denote the window of $f_j$, $I_{f_j} = [c_j, d_j]$,
$\alpha_j = d_j a_j$ and $\beta_j = d_j b_j$.

Hence we aim at showing that for all $j$, $\beta_j \le p$, that is, $b_j$ is either to the left of $d_j$ or at most $p$
markers to the right of $d_j$. The proof for the left bound, that is, to show that $a_j$ is at most $p$
markers to the left of $c_j$, is symmetrical.

By maximality of the strip length (Condition 3), the intervals of the strip cannot be extended
to the right. Condition 2 is the one constraining the strip length, hence there exists a fragile piece
$f_{j_0}$ in the path such that this condition is tight, that is, $d_{j_0} = r_{\mathrm{ext}}(s)$, where $s$ is the solid piece
to the left of $f_{j_0}$. Hence, $a_{j_0}$ is not to the right of $d_{j_0}$, and thus $\alpha_{j_0} = d_{j_0} a_{j_0} = r_{\mathrm{ext}}(s) a_{j_0} \le 0$.

Now for all $j$, $\beta_j - \alpha_j = \|[a_j, b_j]\| - 1$, that is, it is the length of the window contained
in $f_j$ minus one. Consequently, $\beta_{j_0} < \|[a_{j_0}, b_{j_0}]\|$. Moreover, for every $1 \le j < \ell$, either the first
markers of the window of $f_j$ and $f_{j+1}$ are matched and thus equidistant to the piece represented
by $u_j$, or the last markers of the window of $f_j$ and $f_{j+1}$ are matched and thus equidistant to $u_j$.
Hence, either $\alpha_j = \alpha_{j+1}$ or $\beta_j = \beta_{j+1}$. In the first case, $\beta_{j+1}$ increases, compared to $\beta_j$, by at
most $\|[a_{j+1}, b_{j+1}]\| - 1$. Summing these increases along the path and using that the windows have
total length $p$, we obtain $\beta_j \le p$ for all $j \ge j_0$. By symmetry, the same holds for
all $j \le j_0$. □

The following rule serves as a "preparation" of our main rule that deals with cycles containing 
repetitive vertices. It will ensure that if there is a cycle containing repetitive vertices, then these 
repetitive vertices will have the same period. 

Frame Rule 4. If the piece graph contains a rep-rep path between repetitive vertices $v_s$ and $v_t$
with strip $\{I_{f_1}, \ldots, I_{f_\ell}\}$ such that the strip $I_f = [u, v]$ in $f$ is shorter than the period $\pi_s$ of $s$,
then add the frame $f \cap [u \triangleleft w, v \triangleright w]$ to $f$.

Proof (of the correctness of Frame Rule 4). By definition, $w$ is at least the total length of the
windows of $\mathcal{P}$. By Proposition 5, the endpoints of the window of $f$ thus have distance at most $w$
from $I_f$. □

Frame Rule 5. If Frame Rule 4 does not apply and the piece graph contains a simple cycle
with repetitive vertices, then do the following. Let $\|\pi\|$ be the length of the period of any repetitive
solid piece in the cycle. Then, create one branch for each edge $\{v_f, u_s\}$ of the cycle where $u_s$ is
a solid vertex for the solid piece $s$. In each branch, add to $\Phi$ the frame

- $f \cap [r_{\mathrm{ext}}(s) \triangleleft (\|\pi\| + w), r_{\mathrm{ext}}(s) \triangleright w]$ if $f$ is to the right of $s$, or

- $f \cap [l_{\mathrm{ext}}(s) \triangleleft w, l_{\mathrm{ext}}(s) \triangleright (\|\pi\| + w)]$ if $f$ is to the left of $s$.

Proof (of the correctness of Frame Rule 5). First, all repetitive pieces of the cycle have the
same period. Indeed, consider any two consecutive repetitive pieces $s$ and $t$ of the cycle: they
are linked by a rep-rep path, in which we compute the strips. All strips in this path have equal
length $S$ and also equal content (Proposition 4). Hence, the maximal extensions of repetitive
pieces $s$ and $t$ have a common substring of length $S$. Since Frame Rule 4 does not apply, we
have $\|\pi_s\| \le S$ and $\|\pi_t\| \le S$. Thus, the maximum extensions of $s$ and $t$ contain a common
substring at least as long as their respective periods. Consequently, their periods are equal, and thus
all repetitive pieces of the cycle have the same period $\pi$.

Let $s_1, s_2, \ldots, s_\ell$ denote the repetitive pieces crossed successively by the cycle (again, we
write $s_{\ell+1} = s_1$). For each $i$, $1 \le i \le \ell$, let $x_i^<$, $x_i^>$, $y_i^<$, $y_i^>$ be the fragile pieces to the left and
right of $s_i$ in $x$ and $s'_i$ in $y$, respectively. For each rep-rep path of the cycle from $s_i$ to $s_{i+1}$,
we say the path is positive if the first vertex after $s_i$ is $v_{x_i^<}$ or $v_{y_i^>}$, and negative otherwise. In
positive rep-rep paths, fragile pieces in $x$ are crossed from right to left (that is, the solid piece
to the right of the fragile piece is "seen" before the solid piece to its left), and fragile pieces in
$y$ are crossed from left to right. Thus a positive path enters $s_{i+1}$ via either $v_{x_{i+1}^>}$ or $v_{y_{i+1}^<}$, and
likewise a negative path enters $s_{i+1}$ via either $v_{x_{i+1}^<}$ or $v_{y_{i+1}^>}$.

First, consider the case that all windows are contained within the strip and that both
endpoints of each window have distance at least $\|\pi\|$ to the borders of the strip. We show that in
this case, we can shift all breakpoints in positive paths to the right by $\|\pi\|$ positions and
all breakpoints in negative paths to the left by $\|\pi\|$ positions. This is done as follows:

- For each fixed vertex $l_s$ in a positive path, cut out $\|\pi\|$ markers from the left of the blocks
containing $s$ and $s'$.

- For each fixed vertex $r_s$ in a positive path, add $\|\pi\|$ markers to the right of the blocks
containing $s$ and $s'$.

- For each fixed vertex $l_s$ in a negative path, add $\|\pi\|$ markers to the left of the blocks
containing $s$ and $s'$.

- For each fixed vertex $r_s$ in a negative path, cut out $\|\pi\|$ markers from the right of the blocks
containing $s$ and $s'$.

- Replace each short block $[a, b]$ in a fragile piece of a positive path by $[a \triangleright \|\pi\|, b \triangleright \|\pi\|]$.

- Replace each short block $[a, b]$ in a fragile piece of a negative path by $[a \triangleleft \|\pi\|, b \triangleleft \|\pi\|]$.

- For a repetitive vertex $v_{s_i}$ such that the paths before and after $v_{s_i}$ enter and leave $v_{s_i}$ via
the same side (either $x_i^<$ and $y_i^<$, or $x_i^>$ and $y_i^>$), either both paths are positive or both paths
are negative. Apply the same operation as if the piece was fixed:

• If the path enters $v_{s_i}$ via $x_i^<$ and leaves via $y_i^<$, then cut out the $\|\pi\|$ leftmost markers
of $s$ and $s'$ if the path is positive, or add $\|\pi\|$ markers to the left of $s$ and $s'$ if the
path is negative.

• If the path enters $v_{s_i}$ via $x_i^>$ and leaves via $y_i^>$, then cut out the $\|\pi\|$ rightmost markers
of $s$ and $s'$ if the path is negative, or add $\|\pi\|$ markers to the right of $s$ and $s'$ if the
path is positive.

- For a repetitive vertex $v_{s_i}$ such that the paths enter and leave the vertex via the same string
(either $x_i^<$ and $x_i^>$, or $y_i^<$ and $y_i^>$), it holds that the paths have the same orientation. Apply a
similar operation as for a short block (assume without loss of generality that the path enters
and leaves via $x$):

• If $v_{s_i}$ is between two positive paths, then replace the block $[a, b]$ of $x$ containing $s_i$ by
$[a \triangleright \|\pi\|, b \triangleright \|\pi\|]$.

• If $v_{s_i}$ is between two negative paths, then replace the block $[a, b]$ of $x$ containing $s_i$ by
$[a \triangleleft \|\pi\|, b \triangleleft \|\pi\|]$.

- For all other repetitive vertices, the paths enter from one string and leave via the other
string, and enter from one side and leave via the other side. Then the paths have opposite
orientations; assume without loss of generality that the entering path is positive and the
outgoing path is negative. Let $[a, b]$ denote the block in $x$ containing $s_i$, and let $[a', b']$
denote the block in $y$ containing $s'_i$.

• If the cycle enters from $y_i^<$ and leaves via $x_i^>$, then replace $[a, b]$ by $[a, b \triangleleft \|\pi\|]$ and $[a', b']$
by $[a' \triangleright \|\pi\|, b']$ ($\|\pi\|$ markers are cut out of both blocks).

• If the cycle enters from $x_i^>$ and leaves via $y_i^<$, then replace $[a, b]$ by $[a, b \triangleright \|\pi\|]$ and $[a', b']$
by $[a' \triangleleft \|\pi\|, b']$ ($\|\pi\|$ markers are added to both blocks).

Thus, all the breakpoints in fragile pieces have been shifted to the right (in positive paths) or to
the left (in negative paths) by a period length $\|\pi\|$. Hence, this modification still gives a partition
of both strings. This partition has the same size as the original one. Furthermore, it is also a
common string partition, which can be seen as follows. The set of strings represented by the
short blocks of $x$ and $y$ remains exactly the same since they were shifted by the period length.
Hence, there is a matching for the short blocks such that each short block is matched to one
representing the same string. For the long blocks, the old matching remains a valid matching:
The blocks containing fixed solid pieces have both been modified on the same side. Either they
have been shortened by $\|\pi\|$ markers, in which case the matched blocks clearly represent equivalent
strings, or $\|\pi\|$ markers have been added on one side. In the latter case, the matched strings are also
equivalent, since the windows have distance at least $\|\pi\|$ to the borders of the strip. The blocks
containing repetitive pieces have either been moved by $\|\pi\|$ positions, shortened by $\|\pi\|$ markers
on the same side, extended by $\|\pi\|$ markers on the same side, or they have been shortened
or extended on different sides. In the first three cases, the strings represented by the new blocks
remain equivalent for the same reasons as for the blocks containing fixed solid pieces. It remains
to show the case in which blocks have been modified on different sides.

First, consider the case in which $[a, b]$ is replaced by $[a, b \triangleleft \|\pi\|]$ and $[a', b']$ by $[a' \triangleright \|\pi\|, b']$.
Since the blocks are periodic with period length $\|\pi\|$, we have $[a' \triangleright \|\pi\|, b'] = [a', b' \triangleleft \|\pi\|]$. In the
old solution, this subinterval of $[a', b']$ was matched with $[a, b \triangleleft \|\pi\|]$, and thus $[a' \triangleright \|\pi\|, b'] =
[a', b' \triangleleft \|\pi\|] = [a, b \triangleleft \|\pi\|]$.

Now consider the case in which $[a, b]$ is replaced by $[a, b \triangleright \|\pi\|]$ and $[a', b']$ by $[a' \triangleleft \|\pi\|, b']$.
Since the blocks are periodic with period length $\|\pi\|$, we have $[a', b'] = [a' \triangleleft \|\pi\|, b' \triangleleft \|\pi\|]$.
Since $[a, b] = [a', b']$, this implies that the first $\|[a, b]\|$ markers of the new blocks are equivalent.
Also because of the periodicity, we have $[b, b \triangleright \|\pi\|] = [b \triangleleft \|\pi\|, b]$. Since $[b \triangleleft \|\pi\|, b] = [b' \triangleleft \|\pi\|, b']$,
this implies that also the last $\|\pi\|$ markers of the new blocks are equivalent.
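The two periodicity facts used in these cases can be illustrated on plain Python strings. This is a small sketch and not part of the paper's construction: it works with letters rather than markers, and the helper names `has_period`, `cut_left_equals_cut_right` and `extend_right_equals_extend_left` are made up for the illustration.

```python
def has_period(s: str, p: int) -> bool:
    """True if s[i] == s[i + p] for every valid i, i.e. p is a period of s."""
    return all(s[i] == s[i + p] for i in range(len(s) - p))


def cut_left_equals_cut_right(s: str, p: int) -> bool:
    """First case: removing p letters on the left or on the right of a block
    with period p leaves the same string."""
    return s[p:] == s[:-p]


def extend_right_equals_extend_left(s: str, p: int) -> bool:
    """Second case: continuing a block with period p by p letters to the right
    gives the same string as continuing it by p letters to the left.
    Assumes len(s) >= p, so the continuation is determined by the block itself."""
    right_extended = s + s[-p:]   # letters that follow s in a periodic context
    left_extended = s[:p] + s     # letters that precede s in a periodic context
    return right_extended == left_extended


block = "abcabcabca"  # period 3, length not a multiple of the period
print(has_period(block, 3), cut_left_equals_cut_right(block, 3),
      extend_right_equals_extend_left(block, 3))
```

For a block without the period, such as "abcabd" with p = 3, the first identity fails, which is why the argument needs the blocks to be periodic with period length equal to the shift.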

Altogether, the modification gives a CSP of the same size, in which the distance between the
window endpoints and the strip endpoints has decreased. The above operation can be repeated
until at least one breakpoint is at distance less than $\|\pi\|$ from the border of a strip. In this
case, all breakpoints of the corresponding path are at distance at most $w + \|\pi\|$ from the border
of their corresponding strip (an argument similar to the proof of Proposition 5 applies). In
some fragile piece $f$, the border of $I_f$ coincides with the maximum extension of an adjacent
solid piece $s$; thus, in $f$, the window is contained in either $[l_{\mathrm{ext}}(s) \triangleleft w, l_{\mathrm{ext}}(s) \triangleright (\|\pi\| + w)]$ or
$[r_{\mathrm{ext}}(s) \triangleleft (\|\pi\| + w), r_{\mathrm{ext}}(s) \triangleright w]$. Since, in one of the considered branches, the rule adds the frame
for this piece $s$ on the correct side of the strip interval, the rule is correct. Note that the modifications
made in the proof do not increase the length of any short block. Hence, the second requirement
for correctness is also satisfied. □

The final case that needs to be considered is the one in which the piece graph is acyclic but 
none of the other rules applies. Then, the piece graph contains a repetitive degree-one vertex. 

Frame Rule 6. If the piece graph contains an edge $\{v_s, v_f\}$ such that $v_s$ is repetitive and has
degree one, then assume without loss of generality that $f$ is to the right of $s$ in $x$, and do the
following. Let $[a_l, a_r]$, $[b_l, b_r]$, and $[c_l, c_r]$ be the (possibly phantom) frames such that $[a_l, a_r]$ is to
the left of $s'$ in $y$, that $[b_l, b_r]$ is to the right of $s'$ in $y$, and that $[c_l, c_r]$ is to the left of $s$ in $x$.
Add the frame $f \cap [f_l, f_r]$ to $f$, where $f_l := c_l \triangleright (a_r b_l + 1)$ and $f_r := c_r \triangleright (a_l b_r + w - 2)$.

Proof (of the correctness of Frame Rule 6). The windows to the left and right of $s'$ in $y$ are
contained in $[a_l, a_r]$ and $[b_l, b_r]$ respectively, and the window to the left of $s$ in $x$ is contained in
$[c_l, c_r]$. Consider the blocks containing $s$ and $s'$, and let $\ell$ be their length. The two endpoints of
the block containing $s'$ are in $[a_l \triangleright 1, a_r]$ and $[b_l, b_r \triangleleft 1]$. Hence $\ell \ge a_r b_l$ and $\ell \le (a_l \triangleright 1)(b_r \triangleleft 1) =
a_l b_r - 2$.

The leftmost marker of the block containing $s$ is contained in $[c_l \triangleright 1, c_r]$. Thus, the rightmost
marker (the one in $f$) is necessarily in $[c_l \triangleright (\ell + 1), c_r \triangleright \ell]$ which, by the above upper and lower
bounds on $\ell$, is contained in $[c_l \triangleright (a_r b_l + 1), c_r \triangleright (a_l b_r - 2)]$. This marker is the leftmost marker of
the window of $f$, which has length at most $w$. Hence the frame $[c_l \triangleright (a_r b_l + 1), c_r \triangleright (a_l b_r + w - 2)]$
contains the window of $f$. The rule is still correct if $s$ or $s'$ corresponds to the end of a string,
since the phantom frames contain the leftmost or rightmost marker of the blocks containing $s$
or $s'$. □
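The frame arithmetic of Frame Rule 6 can be sketched with integer marker positions. The sketch below is an illustration under assumptions: markers are integers, the juxtaposition of two markers in the rule is read as the signed distance from the first to the second (as in the proof of Proposition 5), shifting a marker to the right is addition, and the function names `dist` and `frame_rule_6` are hypothetical.

```python
from typing import Tuple

Interval = Tuple[int, int]  # inclusive marker range (left, right)


def dist(a: int, b: int) -> int:
    """Signed distance from marker a to marker b (positive if b is right of a)."""
    return b - a


def frame_rule_6(a: Interval, b: Interval, c: Interval, f: Interval, w: int) -> Interval:
    """Frame added to the fragile piece f to the right of s in x.

    a, b: frames left and right of s' in y; c: frame left of s in x;
    w: upper bound on the window length.
    """
    a_l, a_r = a
    b_l, b_r = b
    c_l, c_r = c
    f_l = c_l + dist(a_r, b_l) + 1          # earliest possible right end of the block
    f_r = c_r + dist(a_l, b_r) + w - 2      # latest right end, plus room for the window
    # Intersect [f_l, f_r] with the fragile piece f itself.
    return (max(f[0], f_l), min(f[1], f_r))


# Hypothetical numbers: frames around s' in y, a frame left of s in x, and w = 3.
print(frame_rule_6(a=(2, 4), b=(10, 13), c=(20, 22), f=(25, 60), w=3))
```

With these numbers the block length lies between 6 and 9, so the right end of the block containing s lies in [27, 31], and a window of length at most 3 starting there stays inside the computed frame (27, 34).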

After exhaustively applying the frame rules, parts of fragile pieces that are outside of frames 
do not contain a breakpoint. Hence, we perform the following rule which shrinks fragile pieces 
such that they fit their frame; at the same time, the solid pieces are extended accordingly. 

Fitting Rule 1. If there is a fragile piece $f = [a, b]$ with frame $[c, d]$ such that $a \neq c$ or $b \neq d$,
then add $[a, c]$ to the solid piece left of $f$, add $[d, b]$ to the solid piece right of $f$, and set $f := [c, d]$.

Fig. 8. An illustration of Fitting Rule 1 of frames.
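A minimal sketch of Fitting Rule 1 on integer marker intervals may help to see what is moved where. It rests on an assumption that is not stated in this section: adjacent pieces share their boundary marker, so the solid piece to the left of $f = [a, b]$ ends at $a$ and the one to the right starts at $b$. The function name `fitting_rule_1` is hypothetical.

```python
from typing import Tuple

Interval = Tuple[int, int]  # inclusive marker range (left, right)


def fitting_rule_1(left_solid: Interval, fragile: Interval, right_solid: Interval,
                   frame: Interval) -> Tuple[Interval, Interval, Interval]:
    """Shrink a fragile piece to its frame and extend the neighbouring solid pieces.

    Assumes adjacent pieces share their boundary marker and that the frame
    lies inside the fragile piece.
    """
    (p, a), (a2, b), (b2, q) = left_solid, fragile, right_solid
    c, d = frame
    if not (a == a2 and b == b2 and a <= c <= d <= b):
        raise ValueError("pieces must be adjacent and the frame must lie inside f")
    # [a, c] joins the left solid piece, [d, b] joins the right one, f becomes [c, d].
    return (p, c), (c, d), (d, q)


print(fitting_rule_1((0, 5), (5, 20), (20, 30), (8, 12)))
```

If the frame already equals the fragile piece, the call returns the three pieces unchanged, which matches the rule's precondition that it only applies when $a \neq c$ or $b \neq d$.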

We now show two important properties of instances for which none of the frame rules
applies. First, every fragile piece of these instances has a frame. Second, the frame lengths are
upper-bounded by a function of $k$, $\beta$, and the longest period of any repetitive piece.

Lemma 5. Let $C$ be a constraint with frame set $\Phi$ such that none of the Frame Rules 1-6
applies. Then, each fragile piece has a frame, and all frames have length at most $6k^2 w + 3kw +
3k \max\{w, \|\pi\|\}$, where $\|\pi\|$ denotes the length of the longest period of all repetitive solid pieces.

Proof (of Lemma 5). First, we show that every fragile piece has a frame. If the piece graph
contains a cycle, then either Frame Rule 3, 4, or 5 applies. Otherwise, the piece graph is acyclic,
and thus it either contains a degree-one vertex and one of the other Frame Rules applies, or all
vertices have degree zero, which means that all fragile pieces have frames.

Next, we show the upper bound on the frame length. Let $L$ be the length of the longest
frame created in this procedure, and let $\pi$ be the longest period over all repetitive pieces. We
show that

$L \le 6k^2 w + 3kw + 3k \max\{w, \|\pi\|\}$ (2)

Let $h$ be the number of frames created before Frame Rule 6 is first applied, $1 \le h \le 2k$. Rules 1,
3, 4 and 5 produce frames of length at most $\max\{w, \|\pi\|\} + 2w$. Since each application of Rule 2
increases the maximum frame length by $w$, all frames have length at most $\max\{w, \|\pi\|\} + (h +
1)w$ before the first application of Frame Rule 6. Note that once Rule 6 is applied for the first
time, only Rules 2 and 6 can be applied. We introduce the following notation. A solid vertex
(fixed or repetitive) is closed if all its adjacent fragile pieces have frames, and open otherwise.
The weight of an open vertex is the total length of the frames in the adjacent fragile pieces.
Let $W$ denote the sum of the weights of all open vertices.

Before the first application of Frame Rule 6, $W \le 3k[\max\{w, \|\pi\|\} + (h + 1)w]$ (for each
solid piece $s$, the weight of either $v_s$, or $l_s$ and $r_s$ together, is at most the sum of the lengths of
three different frames). Afterwards, each time Rule 2 or 6 is applied, an open vertex with some
weight $u$ is closed, and a frame of length $u + w$ is created in a fragile piece $f$ which is adjacent
to at most one open vertex. Thus, the total weight of open vertices $W$ is increased by at most
$u + w - u = w$ with each application of Frame Rule 2 or 6. They are applied at most $2k - h$
times, hence $W$ is at most

$W \le 3k[\max\{w, \|\pi\|\} + (h + 1)w] + (2k - h)w$
$\le 6k^2 w + 3kw + 3k \max\{w, \|\pi\|\}$.

Since no frame of length more than $W$ can be created, we have $L \le W$, which proves the second
part of the claim. □
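The last inequality in this chain can be checked numerically. The following sketch (function names are hypothetical) evaluates both sides for small parameter values; it is a sanity check of the arithmetic and not a substitute for the proof. Expanding the left-hand side shows that it grows with $h$ and meets the right-hand side exactly at $h = 2k$.

```python
def open_weight_bound(k: int, h: int, w: int, period: int) -> int:
    """Left-hand side: 3k[max{w, |pi|} + (h + 1)w] + (2k - h)w."""
    return 3 * k * (max(w, period) + (h + 1) * w) + (2 * k - h) * w


def lemma5_bound(k: int, w: int, period: int) -> int:
    """Right-hand side: 6k^2 w + 3kw + 3k max{w, |pi|}."""
    return 6 * k * k * w + 3 * k * w + 3 * k * max(w, period)


def inequality_holds(max_k: int = 6, max_w: int = 5, max_period: int = 8) -> bool:
    """Check the inequality for all 1 <= h <= 2k on a small parameter grid."""
    return all(
        open_weight_bound(k, h, w, p) <= lemma5_bound(k, w, p)
        for k in range(1, max_k + 1)
        for h in range(1, 2 * k + 1)
        for w in range(1, max_w + 1)
        for p in range(1, max_period + 1)
    )


print(inequality_holds())
print(open_weight_bound(3, 6, 2, 5), lemma5_bound(3, 2, 5))  # equal at h = 2k
```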

The bound given by the lemma above still contains the maximum period length $\|\pi\|$, which means
that it is too large to be useful for the split procedure. However, the algorithm can now either
find a repetitive piece which can be fixed with few options (see Lemma 6), or the maximum
period length is not too long.

Lemma 6. Let $C$ be a constraint that contains a repetitive solid piece $s$ with period $\pi_s$ such
that each fragile piece adjacent to $s$ or $s'$ has length at most $6(k^2 + k) \|\pi_s\|$. Then, there are at
most $12(k^2 + k)$ feasible alignments, and any CSP satisfying $C$ matches elements of $s$ according
to a feasible alignment.

Proof (of Lemma 6). The alignment corresponding to any CSP satisfying $C$ is necessarily
feasible, since otherwise two distinct solid pieces would be contained in the same block.

Without loss of generality, let $\|s\| \ge \|s'\|$. Thus, in a satisfying CSP $\mathcal{P}$, either the leftmost
marker of $s$ is matched to a marker left of $s'$ (or to the leftmost marker of $s'$), or the
rightmost marker of $s$ is matched to a marker right of $s'$. Consider the first case; by the conditions
of satisfying CSPs, the leftmost marker of $s$ is matched to some marker in the fragile piece to
the left of $s'$. Note that since $s$ and $s'$ have a shortest period $\pi_s$, two different alignments are
separated by a multiple of $\|\pi_s\|$ markers. Hence, there are at most $6(k^2 + k)$ different alignments
in which the leftmost marker of $s$ is matched to some marker of the fragile piece to the left of $s'$.
Similarly, there are at most $6(k^2 + k)$ possible alignments in which the rightmost marker of $s$
is matched to a marker of the fragile piece to the right of $s'$ (for $s = s'$, it is possible that both
left and right endpoints of $s$ are matched to markers of the fragile pieces to the left and right
of $s'$). Overall, the total number of alignments between $s$ and $s'$ thus is at most $12(k^2 + k)$. □
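The counting step can be made concrete with a short sketch. It assumes, for illustration only, that candidate positions inside a fragile piece are integer offsets starting at 0 and spaced exactly one period apart; the function names are hypothetical.

```python
from typing import List


def alignment_offsets(fragile_len: int, period: int) -> List[int]:
    """Offsets inside one adjacent fragile piece that an endpoint of s may be
    matched to, given that two alignments differ by a multiple of the period."""
    return list(range(0, fragile_len, period))


def max_feasible_alignments(k: int, period: int) -> int:
    """Upper bound on the number of alignments when both adjacent fragile
    pieces have the maximum length 6(k^2 + k) * period allowed by the lemma."""
    fragile_len = 6 * (k * k + k) * period
    per_side = len(alignment_offsets(fragile_len, period))  # 6(k^2 + k)
    return 2 * per_side  # left endpoint case plus right endpoint case


print(max_feasible_alignments(k=2, period=5))  # 12 * (4 + 2) = 72
```

The period cancels out of the count, which is the point of the lemma: the number of branches depends only on $k$, even when the period itself is long.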

By guessing the alignments of the long periods we have finally achieved the goal of frames: 
all frames are "short" enough to be split by split. 

Lemma 4. When frames terminates, every fragile piece has length at most $12(k^2 + k)k\beta$.

Proof (of Lemma 4). By Lemma 5, an instance in which no frame rule applies has frames
of length at most $6k^2 w + 3kw + 3k \max\{w, \|\pi\|\}$, where $\pi$ is the longest period among all
repetitive pieces. In case $\|\pi\| > w$, then for each repetitive piece $s$ with period $\pi$, there are at
most $6(k^2 + k) \cdot \|\pi\|$ possibilities to align $s$. Hence, at least one repetitive piece is fixed in the
loop Lines 8-11, and new-align is set "True", which means that the outer loop in frames will
be repeated. Otherwise, $\|\pi\| \le w$ and thus $6k^2 w + 3kw + 3k \|\pi\| \le 6(k^2 + k)w \le 12(k^2 + k)k\beta$. □

The correctness of frames is simply a consequence of the correctness of all single steps 
(always considering the correct branching in each branching step). 

Lemma 2. If there exists a size-$k$ CSP $\mathcal{P}$ satisfying $C$ at the beginning of frames such that the
longest undiscovered block is $\beta$-critical, then frames creates at least one branch such that the
constraint in this branch is satisfied by a size-$k$ CSP $\mathcal{P}'$ whose longest undiscovered block has
length at most $2\beta - 1$.

Proof (of Lemma 2). The correctness of all frame rules has already been proven. The correctness
of Fitting Rule 1 is trivial. Finally, the correctness of Lines 8-11 follows simply from the
fact that the alignment in one of the branches is the correct one (it considers all feasible
alignments). Since the correctness definition of the frame rules demands that all undiscovered blocks
are at most as long as before adding the frame, the size bound for the longest undiscovered
block also holds. □

It thus remains to bound the running time of frames. In particular, we need to show that 
the number of branches is bounded by a function of k. 

Lemma 3. Overall, the calls to frames create $(2k)^{4k^2} \cdot k^{O(k)}$ branches; all other parts of the
algorithm can be performed in poly($n$) time.

Proof (of Lemma 3). First, note that the outer repeat-until loop of frames is repeated at
most $2k$ times over the course of all calls to frames: The procedure frames is called at most $k$
times from the main method. Each additional time the repeat-until loop is repeated, there is a
pair of repetitive solid pieces that becomes a pair of fixed solid pieces at Line 10 of the previous
pass of the repeat-until loop. This can happen at most $k$ times.

Second, note that the while loop of Lines 4-5 is iterated at most $2k$ times in each repetition of
the outer repeat-until loop of frames: each rule creates exactly one frame, and, by Observation 1,
there are at most $2k - 2$ fragile pieces.

Hence, there are at most $4k^2$ times in which one of the frame rules at Line 5 creates branches
and at most $k$ times in which branches are created at Line 10. The only frame rules that perform
branchings are Frame Rules 3 and 5. In both cases, the rule branches into at most $2k$ cases,
since each cycle has at most $k$ solid vertices and thus at most $2k$ edges in the cycle
under consideration. Hence, the branchings performed by the frame rules increase the running
time by a factor of $O((2k)^{4k^2})$. Each of the at most $k$ branchings in Line 10 is among at
most $12(k^2 + k)$ choices (Lemma 6). Hence, these branchings increase the running time by
a factor of $O((12k)^{2k} \cdot k^k)$. Hence, the overall increase due to the branching is by a factor
of $(2k)^{4k^2} \cdot k^{O(k)}$; all other steps can be performed in polynomial time. □

6 Conclusion

An improvement of the so far very impractical running time is desirable; the bottleneck appears
to be that some of the frame rules still have to branch. Furthermore, it would be interesting to see
whether our result for MCSP can be extended to the "signed" MCSP [3, 5, 10], where each
marker is annotated with a direction and one may reverse blocks before matching.